Abstract

We prove the following results concerning the combinatorics of list decoding, motivated
by the exponential gap between the known upper bound (of $O(1/\gamma)$) and lower bound (of
$\Omega_p(\log(1/\gamma))$) for the list-size needed to decode up to radius $p$ with rate $\gamma$ away
from capacity, i.e., $1 - h(p) - \gamma$ (here $p \in (0, 1/2)$ and $\gamma > 0$).

• We prove that in any binary code $C \subseteq \{0,1\}^n$ of rate $1 - h(p) - \gamma$, there must exist a set
$\mathcal{L} \subset C$ of $\Omega_p(1/\sqrt{\gamma})$ codewords such that the average distance of the points in $\mathcal{L}$ from
their centroid is at most $pn$. In other words, there must exist $\Omega_p(1/\sqrt{\gamma})$ codewords with
low "average radius". The motivation for this result is that it gives a list-size lower bound
for a strong notion of list decoding; this strong form has implicitly been used in the
previous negative results for list decoding. (The usual notion of list decoding corresponds
to replacing average radius by the minimum radius of an enclosing Hamming ball.)

The remaining results are for the usual notion of list decoding: 

• We give a short simple proof, over all fixed alphabets, of the above-mentioned $\Omega_p(\log(1/\gamma))$
lower bound due to Blinovsky.

• We show that one cannot improve the $\Omega_p(\log(1/\gamma))$ lower bound via techniques based
on identifying the zero-rate regime for list decoding of constant-weight codes (this is a
typical approach for negative results in coding theory, including the $\Omega_p(\log(1/\gamma))$ list size
lower bound). On a positive note, our $\Omega_p(1/\sqrt{\gamma})$ lower bound for the strong form of list
decoding does circumvent this barrier.

• We show a "reverse connection" showing that constant-weight codes for list decoding 
imply general codes for list decoding with higher rate. This shows that the best possible 
list-size, as a function of the gap $\gamma$ of the rate to the capacity limit, is the same up to
constant factors for both constant-weight codes and general codes.

• We give simple second moment based proofs that w.h.p. a list-size of $\Omega_p(1/\gamma)$ is needed
for list decoding random codes from errors as well as erasures, at rates which are $\gamma$ away
from the corresponding capacities. For random linear codes, the corresponding list size
bounds are $\Omega_p(1/\gamma)$ for errors and $\exp(\Omega_p(1/\gamma))$ for erasures.

srivatsa@cs.cmu.edu



1 Introduction 



The list decoding problem for an error-correcting code $C \subseteq \Sigma^n$ consists of finding the set of all
codewords of $C$ within Hamming distance $pn$ from an input string $y \in \Sigma^n$. Though it was originally
introduced in early work of Elias and Wozencraft [5, 14] in the context of average decoding error
probability estimation for random error models, recently the main interest in list decoding has
been for adversarial error models. List decoding enables correcting up to a factor two more
worst-case errors compared to algorithms that are always restricted to output a unique answer, and this
potential has even been realized algorithmically [9, 7, 11].

In this work, we are interested in some fundamental combinatorial questions concerning list
decoding, which highlight the important trade-offs in this model. Fix $p \in (0, 1/2)$ and a
positive integer $L$. We say that a binary code $C \subseteq \{0,1\}^n$ is $(p, L)$-list decodable if every Hamming
ball of radius $pn$ has fewer than $L$ codewords.^1 Here, $p$ corresponds to the error-fraction and $L$ to
the list-size needed by the error-correction algorithm. Note that $(p, L)$-list decodability imposes
a sparsity requirement on the distribution of codewords in Hamming space. A natural
combinatorial question that arises in this context is to place bounds on the largest size of a code meeting
this requirement. In particular, an outstanding open question is to characterize the maximum rate
(defined to be the limiting ratio $\frac{\log |C|}{n}$ as $n \to \infty$) of a $(p, L)$-list decodable code.

By a simple volume packing argument, it can be shown that a $(p, L)$-list decodable code has
rate at most $1 - h(p) + o(1)$. (Throughout, for $x \in [0, 1/2]$, we use $h(x)$ to denote the binary entropy
function at $x$.) Indeed, picking a random center $x$, the Hamming ball $B(x, pn)$ contains at least
$|C| \cdot \binom{n}{pn} 2^{-n} = |C| \cdot 2^{-(1-h(p)+o(1))n}$ codewords in expectation. Bounding this by $(L - 1)$, we get the claim.
On the positive side, in the limit of large $L$, the rate of a $(p, L)$-list decodable code approaches the
optimal $1 - h(p)$. More precisely, for any $\gamma > 0$, there exists a $(p, 1/\gamma)$-list decodable code of rate at
least $1 - h(p) - \gamma$. In fact, a random code of rate $1 - h(p) - \gamma$ is $(p, 1/\gamma)$-list decodable whp [15, 6],^2
and a similar result holds for random linear codes (with list-size $C_p/\gamma$) [8]. In other words, a
dense random packing of $2^{(1-h(p)-\gamma)n}$ Hamming balls of radius $pn$ (and therefore volume $\approx 2^{h(p)n}$
each) is "near-perfect" whp in the sense that no point is covered by more than $O(1/\gamma)$ balls.

The determination of the best asymptotic code rate of binary $(p, L)$-list decodable codes as $p, L$
are held fixed and the block length grows is wide open for every choice of $p \in (0, 1/2)$ and integer
$L \ge 1$. However, we do know that this rate tends to $1 - h(p)$ in the limit of large $L \to \infty$. To
understand this rate of convergence as a function of list size $L$, following [8], let us define $L_{p,\gamma}$
to be the minimum integer $L$ such that there exist $(p, L)$-list decodable codes of rate $1 - h(p) - \gamma$
for infinitely many block lengths $n$ (the quantity $\gamma$ is the "gap" to "list decoding capacity"). In [1],
Blinovsky showed that a $(p, L)$-list decodable code has rate at most $1 - h(p) - 2^{-\Theta_p(L)}$. In particular,
this implies that for any $L < \infty$, a $(p, L)$-list decodable code has rate strictly below the optimal
$1 - h(p)$. Stated in terms of $L_{p,\gamma}$, his result gives $L_{p,\gamma} \ge \Omega_p(\log(1/\gamma))$. We provide a short and
simple proof of this lower bound in Section 4, which also works almost as easily over non-binary
alphabets. In contrast, Blinovsky's subsequent proof for the non-binary case involved substantial
technical effort [3, 4].

Observe the exponential gap (in terms of the dependence on $\gamma$) between the $O(1/\gamma)$ upper
bound and the $\Omega_p(\log(1/\gamma))$ lower bound on the quantity $L_{p,\gamma}$. Although this is a basic and
fundamental question about sphere packings in Hamming space with direct relevance to list decoding,
there has been no progress on narrowing this asymptotic gap in the 25 years since the works of
Zyablov-Pinsker [15] and Blinovsky [1]. This is the motivating challenge driving this work.

^1 This differs from the traditional definition of $(p, L)$-list decodability, which requires at most $L$ codewords. The
modified definition ends up being more convenient for our purposes in this paper. Further, we are interested in the
regime of large $L$ where the two definitions are almost equivalent.

^2 By using random coding with expurgation, the list size can be improved to $h(p)/\gamma$.

1.1 Prior work on list-size lower bounds 

We now discuss some lower bounds (besides Blinovsky's general lower bound) on list-size that 
have been obtained in restricted cases. 

Rudra shows that the $O_p(1/\gamma)$ bound obtained via the probabilistic method for random codes
is, in fact, tight up to constant factors [13]. Formally, there exists $L = \Omega_p(1/\gamma)$ such that a random
code of rate $1 - h(p) - \gamma$ is not $(p, L)$-list decodable w.h.p. His proof uses near-capacity-achieving
codes for the binary symmetric channel, the existence of which is promised by Shannon's theorem,
followed by a second moment argument. We give a simpler proof via a more direct use of the
second moment method. This has the advantage that it works uniformly for random general as
well as random linear codes, and for channels that introduce errors as well as erasures.

Guruswami and Vadhan [10] consider the problem of list size tradeoff when the channel may
corrupt close to half the bits, that is, when $p = 1/2 - \epsilon$, and more generally $p = 1 - 1/q - \epsilon$ for
codes over an alphabet of size $q$. (Note that decoding is impossible if the channel could corrupt
up to a $1/2$ fraction of bits.) They show that there exists $c > 0$ such that for all $\epsilon > 0$ and all block
lengths $n$, any $(1/2 - \epsilon, c/\epsilon^2)$-list decodable code contains $O_\epsilon(1)$ codewords. For $p$ bounded away
from $1/2$ (or $1 - 1/q$ in the $q$-ary case), their methods do not yield any non-trivial list-size lower
bound as a function of the gap $\gamma$ to list decoding capacity.

1.2 Our main results 

We have already mentioned our new proofs of Blinovsky's lower bound for general codes, and 
the asymptotically optimal list-size lower bound for random (and random linear) codes. 

Our main results are motivated by the above-mentioned approaches, based on a strong form 
of list decoding, used in [1, 10] to establish list-size lower bounds. In this work, we formally define 
the notion of $(p, L)$-strong list decodability of a code underlying these proofs. This notion is a very
natural one: a code is $(p, L)$-strongly list decodable if for every $L$ codewords, the average distance
of their centroid from the $L$ codewords exceeds $pn$. Note that this is a stronger requirement than
$(p, L)$-list decodability where only the maximum distance from any center point to the $L$ codewords
must exceed $pn$.

We are able to prove nearly tight bounds on the achievable rate of a $(p, L)$-strongly list decodable
code. To state our result formally, denote by $L^{\mathrm{strong}}_{p,\gamma}$ the minimum $L$ such that there exists a
$(p, L)$-strongly list decodable code family of rate $1 - h(p) - \gamma$. A simple random coding argument shows
that a random code of rate $1 - h(p) - \gamma$ is $(p, 1/\gamma)$-strongly list decodable (matching the list decodability
of random codes). That is, $L^{\mathrm{strong}}_{p,\gamma} \le 1/\gamma$. Our main technical result is a lower bound on the list
size that is polynomially related to the upper bound, namely $L^{\mathrm{strong}}_{p,\gamma} \ge \Omega_p(\gamma^{-1/2})$.

1.3 Our other results 

We also make several clarifying observations on the landscape of the bounds for list-decodable
codes, as well as the general methodology of proving combinatorial limitations of list-decodable
codes. Many negative results in coding theory (i.e., results which place an upper bound on rate)
proceed via a typical approach in which they pass to a constant weight $\lambda \in (p, 1/2]$; that is, restrict
the codewords to be of weight exactly $\lambda n$. They show that under this restriction, a code with the
stated properties must have a constant number of codewords (that is, zero rate). Mapping this
bound back to the unrestricted setting one gets a rate upper bound of $1 - h(\lambda)$ for the original
problem. For instance, the Elias-Bassalygo bound for rate $R$ vs. relative distance $\delta$ is of this nature
(here $\lambda$ is picked to be the Johnson radius for list decoding for codes of relative distance $\delta$).

The above is also the approach taken in Blinovsky's work [1] as well as that of [10]. We show
that such an approach does not and cannot give any bound better than Blinovsky's $\Omega_p(\log(1/\gamma))$
bound for $L_{p,\gamma}$. More precisely, for any $\lambda > p + 2^{-c_p L}$ for some $c_p > 0$, we show that there exists
a $(p, L)$-(strongly) list decodable code of rate $\Omega_{p,L}(1)$ all of whose codewords have weight $\lambda n$. Thus in
order to improve the lower bound, we must be able to handle codes of strictly positive rate, and cannot
deduce the bound by pinning down the zero-rate regime of constant-weight codes. This perhaps points to
why improvements to Blinovsky's bounds have been difficult. On a positive note, we remark that we are
able to effect such a proof for strong list decodability (some details follow next).^3

To describe the method underlying our list-size lower bound for strongly list-decodable codes,
it is convenient to express the statement as an upper bound on rate in terms of list-size $L$. Note that
a list-size lower bound of $L \ge \Omega_p(1/\sqrt{\gamma})$ for $(p, L)$-strongly list-decodable codes of rate $1 - h(p) - \gamma$
amounts to proving an upper bound of $1 - h(p) - \Omega_p(1/L^2)$ on the rate of $(p, L)$-strongly list
decodable codes. Our proof of such an upper bound proceeds by first showing a rate upper
bound of $h(\lambda) - h(p) - \Omega_p(1/L^2)$ for such codes whose codewords are restricted to all have weight
$\lambda n$ (for a suitable choice of $\lambda \in (p, 1/2]$). To map this back to the original setting (with no weight
restrictions on codewords), one simply notes that every $(p, L)$-strongly list decodable code of rate
$R$ has a subcode of constant weight $\lambda n$ and rate $R - (1 - h(\lambda))$.

Generally speaking, by passing to a constant-weight subcode, one can translate combinatorial 
results on limitations of constant-weight codes to results showing limitations for the case of gen- 
eral codes. We are not aware of a reverse connection (for any of the standard combinatorial coding 
problems) that allows one to translate limitations for general codes into corresponding limitations 
for constant-weight codes. This leaves open the possibility that the problem of showing limita- 
tions of constant-weight codes may be harder than the corresponding problem for general codes, 
or worse still, have a different answer making it impossible to solve the problem for general codes 
via the methodology of passing to constant-weight codes. 

We show that for the problem of list decoding this is fortunately not the case, and there is
in fact a reverse connection of the above form. Formally, we prove that a rate upper bound of
$1 - h(p) - \gamma_{p,L}$ for $(p, L)$-list decodable codes implies a rate upper bound of
$h(\lambda) - h(p) - \gamma_{p,L} \cdot (\ldots)$
for $(p, L)$-list decodable codes whose codewords must all have Hamming weight $\lambda n$. A similar
claim holds also for strong list decodability, though we don't state it formally.

1.4 Our proof techniques 

Our proofs in this paper employ variants of the standard probabilistic method. We show an
extremely simple probabilistic argument that yields an $\Omega_p(\log(1/\gamma))$ bound on the list size of a
standard list decodable code; we emphasize that this is qualitatively the tightest known bound. For
the "strong list decoding" problem that we introduce, we are able to improve this list-size bound
to $\Omega_p(1/\sqrt{\gamma})$. The proof is based on the idea that instead of picking the "bad list decoding center"
uniformly at random, one can try to pick it randomly very close to a codeword, and this still gives
similar guarantees on the number of near-by codewords. Now since the quantity of interest is the
average radius, this close-by codeword gives enough savings for us.

^3 Though the technical details are very different, it may be worth noting the similarity of this with bounds for rate
vs. distance. Passing to the zero-rate regime for constant-weight codes gives the Elias-Bassalygo bound, and the more
sophisticated and stronger second linear programming bound is obtained by working in the regime of positive rate.

For bounds on random codes, our main novelty is to define a random variable $Z$ that counts
the number of "violations" of the list-decoding property of the code. We then show that $Z$ has an
exponentially large mean around which it is concentrated w.h.p. This yields that the code cannot
be list-decodable with high probability, for suitable values of rate and list size parameters.

1.5 Organization 

We define some useful notation and the formal notion of strong list decodability in Section 2. Our
main negative result on limitations of strongly list-decodable codes appears in Section 3; for ease of
readability, the most technical part of the proof is isolated as Appendix A. We give our short proof
of Blinovsky's lower bound in Section 4. Our results about the zero-rate regime for
constant-weight codes and the reverse connection between list decoding bounds for general codes and
constant-weight codes appear in Section 5. Finally, our list size lower bounds for random codes
are discussed in Section 6, with the case of list decoding from erasures appearing as Appendix B.

2 Notation and Preliminaries 

We recall some standard terminology regarding error-correcting codes. For $q \ge 2$, let $[q]$ denote the
set $\{0, 1, \ldots, q - 1\}$. By a $q$-ary code, we mean any set $C \subseteq [q]^n$, where $n$ is called the blocklength
of $C$. We will mainly focus on the special case of binary codes corresponding to $q = 2$. The rate
$R = R(C)$ is defined to be $\frac{\log |C|}{n \log q}$. For $x \in [q]^n$ and $S \subseteq [n]$, we denote by $x|_S$ the restriction of $x$ to
the coordinates in $S$. Let $\mathrm{supp}(x) := \{i \in [n] : x_i \ne 0\}$. A subcode of $C$ is simply any $C' \subseteq C$.

For $x, x' \in [q]^n$, define the Hamming distance between $x$ and $x'$, denoted $d(x, x')$, to be the
number of coordinates in which $x$ and $x'$ differ. The weight (or density) of $x \in [q]^n$, denoted $\mathrm{wt}(x)$,
is $d(0, x)$, where $0$ is the all-zeros vector in $[q]^n$. Also let $B(x, r)$ denote the Hamming ball of
radius $r$ centered at $x$; that is, $B(x, r) := \{x' \in [q]^n : d(x, x') \le r\}$. In this work, we introduce a
nonstandard extension of the notion of distance to small lists of vectors as follows: for $\mathcal{L} \subseteq [q]^n$,
define $D_{\max}(x, \mathcal{L}) := \max\{d(x, x') : x' \in \mathcal{L}\}$ and $D_{\mathrm{avg}}(x, \mathcal{L}) := E_{x' \in \mathcal{L}}[d(x, x')]$.
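
As a concrete illustration of the two list distances just defined, here is a minimal Python sketch. The function names (`hamming`, `weight`, `d_max`, `d_avg`) and the representation of words as tuples of integers are our own choices for illustration, not notation from the paper.

```python
from typing import Sequence

Word = Sequence[int]  # a vector in [q]^n, stored as a tuple or list of ints


def hamming(x: Word, y: Word) -> int:
    """d(x, y): number of coordinates in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


def weight(x: Word) -> int:
    """wt(x) = d(0, x): number of nonzero coordinates."""
    return sum(a != 0 for a in x)


def d_max(x: Word, lst: Sequence[Word]) -> int:
    """D_max(x, L): largest distance from the center x to a list member."""
    return max(hamming(x, c) for c in lst)


def d_avg(x: Word, lst: Sequence[Word]) -> float:
    """D_avg(x, L): average distance from the center x to the list members."""
    return sum(hamming(x, c) for c in lst) / len(lst)


if __name__ == "__main__":
    center = (0, 0, 0, 0)
    words = [(1, 1, 0, 0), (0, 0, 0, 1), (1, 1, 1, 0)]
    print(d_max(center, words), d_avg(center, words))
```

In this example the three distances are 2, 1 and 3, so the maximum is 3 while the average is only 2; the average never exceeds the maximum.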

We formalize the error recovery capability of the code using list decoding. 

Definition 1. Fix $0 < p < 1/2$ and a positive integer $L$.

1. A $q$-ary code $C$ is said to be $(p, L)$-list decodable if for all $x \in [q]^n$, we have $|C \cap B(x, pn)| \le L - 1$.
In other words, for any $x$ and any list $\mathcal{L} \subseteq C$ of size at least $L$, we have $D_{\max}(x, \mathcal{L}) > pn$.

2. $C$ is said to be $(p, L)$-strongly list decodable if for any $x$ and $\mathcal{L}$ as in the previous item, we have
$D_{\mathrm{avg}}(x, \mathcal{L}) > pn$.

3. $C$ is said to be $(\lambda; p, L)$-(strongly) list decodable if $C$ is $(p, L)$-(strongly) list decodable, and every
codeword in $C$ has weight exactly $\lambda n$.
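
The first two items of Definition 1 can be checked by brute force on tiny binary codes. The sketch below is only an illustration under our own assumptions: the function names are hypothetical, the code is a list of distinct 0/1 tuples, and the search is exponential in $n$. For the strong notion it uses two elementary observations: lists of size exactly $L$ suffice (dropping the farthest members of a longer list cannot raise the average), and for a fixed binary list the center minimizing $D_{\mathrm{avg}}$ is the coordinate-wise majority vote.

```python
from itertools import combinations, product


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def is_list_decodable(code, p, L):
    """True iff every Hamming ball of radius p*n holds at most L-1 codewords."""
    n = len(code[0])
    radius = p * n
    for x in product((0, 1), repeat=n):          # every possible center
        if sum(hamming(x, c) <= radius for c in code) >= L:
            return False
    return True


def min_avg_radius(lst):
    """min over centers x of D_avg(x, lst), for a list of binary words."""
    size = len(lst)
    total = 0
    for column in zip(*lst):                     # one coordinate at a time
        ones = sum(column)
        total += min(ones, size - ones)          # cost of the majority bit
    return total / size


def is_strongly_list_decodable(code, p, L):
    """True iff every L codewords have average radius exceeding p*n."""
    n = len(code[0])
    return all(min_avg_radius(lst) > p * n for lst in combinations(code, L))


if __name__ == "__main__":
    code = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 1, 0)]
    print(is_list_decodable(code, 0.25, 3))           # True
    print(is_strongly_list_decodable(code, 0.25, 3))  # False
```

The three-word code in the usage example has no ball of radius 1 containing all its codewords, yet the center 0000 is at average distance exactly 1 from them, so it is $(1/4, 3)$-list decodable without being $(1/4, 3)$-strongly list decodable.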

Here the first definition is standard, and the third (i.e., $(\lambda; p, L)$-list decodability) provides a
useful notation. Also we emphasize that while formally introduced by us, the notion of
$(p, L)$-strong list-decodability is implicit in [1, 1, 10]. The following claim asserts that this is a
syntactically stronger notion than standard list-decodability:

^4 $\log$ denotes logarithm to base 2.

Proposition 2. If $C$ is $(p, L)$-strongly list decodable, then $C$ is $(p, L)$-list decodable.

Proof: Follows from the fact that $D_{\max}(x, \mathcal{L})$ always dominates $D_{\mathrm{avg}}(x, \mathcal{L})$ for all $x$ and size-$L$ lists
$\mathcal{L}$ of $C$. ■

Following (and extending) the notation in [8], we make the following definitions to quantify 
the trade-offs in the different parameters (error-correction radius p, list-size L, weight of the code 
$\lambda$ and its rate $R$). Fix $0 < p, \lambda \le 1/2$, $0 < R < 1$ and a positive integer $L$. Say that the triple $(p, L; R)$
is achievable for (strongly) list decodable codes if there exist $(p, L)$-(strongly) list decodable codes
of rate $R$ for infinitely many lengths $n$. Similarly the 4-tuple $(\lambda; p, L; R)$ is achievable if there exist
$(\lambda; p, L)$-(strongly) list decodable codes of rate $R$.

Definition 3. Fix $0 < p < 1/2$.

1. Define $L_{p,\gamma}$ to be the least integer $L$ such that $(p, L; 1 - h(p) - \gamma)$ is achievable. Similarly, define
$R_{p,L}$ to be the supremum over $R$ such that $(p, L; R)$ is achievable for list decodable codes. Finally, the
gap to the optimal/limiting rate (of $1 - h(p)$) is defined to be $\gamma_{p,L} := 1 - h(p) - R_{p,L}$.

2. For $\lambda \in (p, 1/2]$, define $R_{p,L}(\lambda)$ to be the supremum rate $R$ for which the 4-tuple $(\lambda; p, L; R)$ is
achievable.

We can also define analogous quantities for strong list decoding, but to prevent notational 
clutter, we will not explicitly do so. 

Useful properties of standard functions. We collect together several facts and estimates that will 
be useful in our results. The proofs of the standard claims in this subsection will be omitted. 

We use the notation $f(n, a, b, i)$ to denote $\binom{a}{i}\binom{n-a}{b-i} / \binom{n}{b}$. We say that a random variable $X$ follows
the hypergeometric distribution with parameters $n, a, b$ if $\Pr[X = i] = f(n, a, b, i)$. We will need
the following elementary combinatorial identity involving the hypergeometric distribution.

Fact 4. For all $n, a, b, i$, we have $f(n, a, b, i) = f(n, b, a, i)$.
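
The symmetry in Fact 4 is easy to confirm exactly for small parameters. The following sketch uses exact rational arithmetic; the function name `f` mirrors the paper's notation, while the restriction to $0 \le i \le \min(a, b)$ is our own choice to keep every binomial argument non-negative.

```python
from fractions import Fraction
from math import comb


def f(n: int, a: int, b: int, i: int) -> Fraction:
    """Hypergeometric probability C(a, i) * C(n - a, b - i) / C(n, b)."""
    return Fraction(comb(a, i) * comb(n - a, b - i), comb(n, b))


def symmetric_up_to(n_max: int) -> bool:
    """Check f(n, a, b, i) == f(n, b, a, i) for all small parameters."""
    return all(
        f(n, a, b, i) == f(n, b, a, i)
        for n in range(1, n_max + 1)
        for a in range(n + 1)
        for b in range(n + 1)
        for i in range(min(a, b) + 1)
    )


if __name__ == "__main__":
    print(symmetric_up_to(10))
    # The probabilities over i sum to 1, as for any distribution.
    print(sum(f(12, 5, 7, i) for i in range(6)))
```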

We will use the following estimates related to the binary entropy function without further 
mention. 

Fact 5 (The binary entropy function). Define the binary entropy function by $h(z) := -z \log z - (1 - z) \log(1 - z)$.
Then for any constant $z \in (0, 1)$ and $n \to \infty$, we have $2^{h(z)n - o(n)} \le \binom{n}{zn} \le 2^{h(z)n}$.

Fact 6. For all $z \in (0, 1)$, we have $z \log(1/z) + (\log e)(z - z^2) \le h(z) \le z \log(1/z) + (\log e) z$.
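
Facts 5 and 6 can be sanity-checked numerically. The sketch below is illustrative only: the helper names are ours, and for the lower bound in Fact 5 we assume the standard explicit form $\binom{n}{k} \ge 2^{h(k/n)n}/(n+1)$ as one concrete instance of the $o(n)$ term.

```python
from math import comb, e, log2


def h(z: float) -> float:
    """Binary entropy function, for 0 < z < 1."""
    return -z * log2(z) - (1 - z) * log2(1 - z)


def fact5_holds(n: int, k: int) -> bool:
    """n*h(k/n) - log2(n+1) <= log2 C(n, k) <= n*h(k/n), for 0 < k < n."""
    exponent = n * h(k / n)
    value = log2(comb(n, k))
    return exponent - log2(n + 1) <= value <= exponent


def fact6_bounds(z: float):
    """Lower and upper estimates of h(z) from Fact 6."""
    base = z * log2(1 / z)
    return base + log2(e) * (z - z * z), base + log2(e) * z


if __name__ == "__main__":
    print(all(fact5_holds(200, k) for k in range(1, 200)))
    lower, upper = fact6_bounds(0.1)
    print(lower, h(0.1), upper)
```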

3 Bounds for strong list decodability 

In this section, we establish upper and lower bounds of $1 - h(p) - 1/L^{\Theta(1)}$ on the rate for
$(p, L)$-strongly list decodable codes.

3.1 Lower bound on rate. 

The result below follows by a standard random coding argument.

Theorem 7. Let $0 < p < 1/2$ and $L$ a positive integer. Then for all $\epsilon > 0$ and all sufficiently large lengths
$n$, there exists a $(p, L)$-strongly list decodable code of rate at least $1 - h(p) - 1/L - \epsilon$.

Proof: We show that a random code $C : \{0,1\}^{Rn} \to \{0,1\}^n$ of rate $R = 1 - h(p) - 1/L - \epsilon$ is
$(p, L)$-strongly list-decodable whp. For each $m \in \{0,1\}^{Rn}$, pick $C(m)$ independently and uniformly at
random from $\{0,1\}^n$. For any $x \in \{0,1\}^n$ and any distinct $L$-tuple $\{m_1, \ldots, m_L\} \subseteq \{0,1\}^{Rn}$, we are
interested in bounding the probability of the event that $D \le Lpn$, where $D := \sum_{i=1}^{L} d(x, C(m_i))$.
Let $X$ be the $\{0,1\}$-string of length $Ln$ obtained by concatenating $x$ repeatedly $L$ times.
Similarly, let $Y$ be the $\{0,1\}$-string obtained by concatenating $C(m_1), \ldots, C(m_L)$; then, $Y$ is distributed
uniformly at random in $\{0,1\}^{Ln}$ independent of the choice of $x$. Now, note that $D$ is simply the
Hamming distance between $X$ and $Y$. Hence, the probability that $D \le pLn$ is at most $2^{-(1-h(p))Ln}$.

Finally, by a union bound over the choice of $x$ and $\{m_1, \ldots, m_L\}$, the probability that the code
is not $(p, L)$-strongly list decodable is at most

$$2^{n} \cdot 2^{RnL} \cdot 2^{-(1-h(p))Ln} = 2^{-\epsilon L n} = o(1)$$

for the given choice of $R$, thus establishing the claim. ■
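
The arithmetic behind the union bound in the proof of Theorem 7 can be spelled out as a small calculation. The sketch below, with names of our own choosing, adds up the three exponents (per unit of $n$): the $2^n$ centers, the at most $2^{RnL}$ choices of $L$ messages, and the $2^{-(1-h(p))Ln}$ probability for a fixed choice. With $R = 1 - h(p) - 1/L - \epsilon$ the total should come out to $-\epsilon L$.

```python
from math import log2


def h(z: float) -> float:
    """Binary entropy function, for 0 < z < 1."""
    return -z * log2(z) - (1 - z) * log2(1 - z)


def union_bound_exponent(p: float, L: int, eps: float) -> float:
    """Exponent, divided by n, of the failure probability bound."""
    rate = 1 - h(p) - 1 / L - eps
    centers = 1.0                  # 2^n choices of the center x
    lists = rate * L               # at most 2^{RnL} choices of L messages
    hit = -(1 - h(p)) * L          # Pr[D <= pLn] <= 2^{-(1-h(p))Ln}
    return centers + lists + hit


if __name__ == "__main__":
    # Expected to equal -eps * L up to floating-point rounding.
    print(union_bound_exponent(0.1, 5, 0.01))
```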



3.2 Upper bound on rate. 

We now show an upper bound of $1 - h(p) - c_p/L^2$ on the rate of a $(p, L)$-strongly list decodable code.
The proof is based on a simple idea, but to convert this to a full proof requires some calculations
and analytic manipulations (involving the hypergeometric distribution and the entropy function).
To repeat our main idea from the Introduction, instead of picking the "bad list decoding center"
uniformly at random, we pick it randomly very close to a codeword, and this still gives similar
guarantees on the number of near-by codewords. Now since the quantity of interest is the average
radius, this close-by codeword gives enough savings for us.

Before we proceed with the proof, we first establish a rate upper bound for the special case
when all codewords are restricted to be of a fixed weight $\lambda n$ for a suitably chosen $\lambda \in (p, 1/2)$.
We can then map this bound to the general case by the following standard argument. (We will
establish a converse to this claim in Section 5.)

Lemma 8. Let $\lambda \in (p, 1/2)$ be such that $\lambda n$ is an integer. If $C$ is a $(p, L)$-(strongly) list-decodable code
of rate $R = 1 - h(p) - \gamma$, then there exists a $(\lambda; p, L)$-(strongly) list decodable code $C'$ of rate at least
$h(\lambda) - h(p) - \gamma - o(1)$.

Proof: For a random center $x$, the expected number of codewords $c \in C$ with $d(x, c) = \lambda n$ is
exactly $|C| \cdot \binom{n}{\lambda n} \cdot 2^{-n} \ge 2^{Rn} \cdot 2^{(h(\lambda)-1-o(1))n} = 2^{(h(\lambda)-h(p)-\gamma-o(1))n}$. Then there exists an $x$ such that
the subcode $C_x$ consisting of all codewords at a distance $\lambda n$ from $x$ has a rate at least
$h(\lambda) - h(p) - \gamma - o(1)$. Defining $C'$ to be $C_x - x$ gives the claim. ■

We now state our main result establishing a rate upper bound for $(p, L)$-strongly list decodable codes.

Theorem 9 (Main theorem). Let $0 < p < 1/2$ and let $L$ be a sufficiently large positive integer. Then, there
exist $a_p, c_p > 0$ such that the following holds (for sufficiently large lengths $n$):

1. If $C$ is a $(p, L)$-strongly list-decodable code, then $C$ has rate at most $1 - h(p) - c_p/L^2$.

2. For $\lambda := p + a_p/L$, if $C$ is a $(\lambda; p, L)$-strongly list-decodable code, then $C$ has rate at most
$h(\lambda) - h(p) - c_p/L^2$.

Using Lemma 8, it suffices to show the second part. Before we do this, we will establish the
following folklore result, whose proof illustrates our idea in a simple case.

Lemma 10 (A warm-up lemma). If $C$ is a $(\lambda; p, L)$-list-decodable code, then $C$ has rate at most
$h(\lambda) - h(p) + o(1)$.

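Proof: The proof is via the probabilistic method. Pick a random subset $S \subseteq [n]$ of coordinates of
size $\alpha n$, with $\alpha := (\lambda - p)/(1 - 2p)$.^5 Define the center $x$ to be the indicator vector of $S$: $x_i = 1$
iff $i \in S$. Let $\mathcal{L}$ be the set of codewords $c \in C$ such that $\mathrm{wt}(c|_S) \ge (1 - p)\alpha n$. For any $c \in \mathcal{L}$, we have

$$d(x, c) = (\alpha n - \mathrm{wt}(c|_S)) + \mathrm{wt}(c|_{\bar{S}}) \le \alpha p n + (\lambda - \alpha(1 - p))n = (\lambda - \alpha(1 - 2p))n,$$

which equals $pn$ for the given choice of $\alpha$. Hence $\mathcal{L}$ lies entirely inside the ball $B(x, pn)$.

Now, we want to compute $E[|\mathcal{L}|]$. For any fixed $c \in C$, the probability that $c$ lies in $\mathcal{L}$ is at least
$f(n, \lambda n, \alpha n, \alpha(1 - p)n)$, which by Fact 4 equals

$$\binom{\alpha n}{\alpha(1-p)n} \binom{(1-\alpha)n}{(\lambda - \alpha(1-p))n} \Big/ \binom{n}{\lambda n}.$$

Now note that for the given choice of $\alpha$, it holds that $\lambda - (1 - p)\alpha = p(1 - \alpha)$. Therefore,
conveniently, the above expression is equal to

$$\binom{\alpha n}{(1-p)\alpha n} \binom{(1-\alpha)n}{p(1-\alpha)n} \Big/ \binom{n}{\lambda n} = 2^{(\alpha h(p) + (1-\alpha)h(p) - h(\lambda) - o(1))n} = 2^{(h(p) - h(\lambda) - o(1))n}.$$

Therefore, by linearity of expectations, the expected size of $\mathcal{L}$ is at least $|C| \cdot 2^{(h(p)-h(\lambda)-o(1))n} =
2^{(R+h(p)-h(\lambda)-o(1))n}$. On the other hand, the $(p, L)$-list decodability of $C$ implies that $|\mathcal{L}| < L$ with
probability 1. Comparing the lower and upper bounds on the expected size of $\mathcal{L}$, we get
$R + h(p) - h(\lambda) - o(1) \le \frac{1}{n} \log L$, which yields the claim. ■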
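
The role of the choice $\alpha = (\lambda - p)/(1 - 2p)$ in Lemma 10 can be checked on a concrete instance. The sketch below is illustrative: the function names are ours, exact fractions are used for the two identities, and the finite-$n$ exponent is compared with $h(p) - h(\lambda)$ only up to the polynomial factors hidden in the $o(1)$ term.

```python
from fractions import Fraction
from math import comb, log2


def h(z) -> float:
    """Binary entropy function, for 0 < z < 1."""
    z = float(z)
    return -z * log2(z) - (1 - z) * log2(1 - z)


def alpha_for(lam: Fraction, p: Fraction) -> Fraction:
    """The fraction of coordinates placed in S in the proof of Lemma 10."""
    return (lam - p) / (1 - 2 * p)


def membership_exponent(n: int, lam: Fraction, p: Fraction) -> float:
    """(1/n) * log2 of f(n, lam*n, alpha*n, alpha*(1-p)*n)."""
    a = alpha_for(lam, p)
    sizes = (a * n, (1 - p) * a * n, lam * n)
    if any(v.denominator != 1 for v in sizes):
        raise ValueError("n must make all set sizes integral")
    s, k, w = (int(v) for v in sizes)
    numerator = comb(s, k) * comb(n - s, w - k)
    return (log2(numerator) - log2(comb(n, w))) / n


if __name__ == "__main__":
    lam, p = Fraction(3, 8), Fraction(1, 4)
    a = alpha_for(lam, p)
    print(a, lam - (1 - p) * a == p * (1 - a), lam - a * (1 - 2 * p) == p)
    print(membership_exponent(64, lam, p), h(p) - h(lam))
```

For larger $n$ (kept divisible by 64 in this example) the first printed exponent approaches the second.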

Proof of Theorem 9: At a high level, we proceed as in the proof of Lemma 10, but in addition to
the bad list $\mathcal{L}$, we will produce a special codeword $c^* \in \mathcal{L}$ such that $d(x, c^*)$ is much smaller than
$pn$. Then defining a new bad list $\mathcal{L}'$ consisting of $c^*$ and $(L - 1)$ other codewords from $\mathcal{L}$, we show
that $D_{\mathrm{avg}}(x, \mathcal{L}')$ is at most $pn$, which would contradict the strong list decodability of $C$.

We now provide the details. Pick a uniformly random codeword $c^* \in C$ and let $S$ be a random
subset of $\mathrm{supp}(c^*)$ of size $\beta n$, where $\beta$ is a constant to be chosen appropriately later. Let $x$ be
the indicator vector of $S$. Define $\mathcal{L}$ to be the collection of codewords $c \in C$ such that
$\mathrm{wt}(c|_S) \ge (1 - p)|S|$. (Note that $c^* \in \mathcal{L}$.) Conditioned on $c^*$, the probability that $c \in \mathcal{L}$ is

$$Q(c^*, c) := \frac{1}{\binom{\lambda n}{\beta n}} \sum_{j=(1-p)\beta n}^{\beta n} \binom{(\lambda - \delta)n}{j} \binom{\delta n}{\beta n - j},$$

where $d(c^*, c) := 2\delta(c^*, c)n = 2\delta n$. Observe that $Q(c^*, c)$ is really a function of $\delta(c^*, c) = \delta$.
Therefore, the expected size of $\mathcal{L}$ is $E_{c^* \in C}\left[\sum_{c \in C} Q(\delta(c^*, c))\right] = |C| \cdot E_{c, c^* \in C}[Q(\delta)]$. The following claim
lower bounds the expectation of the random variable $Q = Q(\delta)$.

Claim 11 (Estimate of $E\,Q$). There exist $A := (1 - p) \log\left(\frac{1-p}{\lambda}\right) + p \log\left(\frac{p}{1-\lambda}\right)$ and $B = B_p \in (0, \infty)$
such that for any code $C$ with all codewords of weight $\lambda n$, we have

$$E_{c^*, c}[Q(\delta(c^*, c))] \ge 2^{-(A\beta + B\beta^2)n}.$$



^5 The reason for setting $\alpha$ to this value will be clear shortly.

Proof Sketch: First, note that $0 \le \delta \le \lambda$ always. Also, it is easy to see that the quantity $Q(\delta)$ is
monotonically decreasing with increasing $\delta$. Moreover, by a simple application of the
Cauchy-Schwarz inequality, we have $E_{c, c^*}[\delta] \le \lambda(1 - \lambda)$. Now, if $Q$ were a convex function of $\delta$, then we
could lower bound $E[Q(\delta)]$ by Jensen's inequality; unfortunately, the convexity assumption does
not hold. However, it turns out that when $\delta$ is restricted to the "middle" range between the two
extreme regimes described below, we can approximate $Q(\delta)$ well by a convex function $\tilde{Q}(\delta)$. Hence the
proof strategy can be made to work for $\tilde{Q}$, except for the extreme values of $\delta$. We then handle the
"small" regime (i.e., $0 \le \delta \le \lambda p + \ldots$) and the "large" regime (i.e., $\lambda - \ldots \le \delta \le \lambda$) by additional
simple tricks.

The complete proof is quite cumbersome since it involves heavy use of several standard
estimates (of binomial coefficients) and Taylor approximations. Moreover, one also needs to verify
the convexity of $\tilde{Q}(\delta)$. For ease of readability, we finish the rather technical proof in Appendix A.
■ 

Let us now proceed with completing the proof of Theorem 9. By Claim 11, assuming
$R > A\beta + B\beta^2 + o(1)$ for a suitable $o(1)$ term, $E[|\mathcal{L}|] \ge L$. Fix $c^*$ and $S$ such that $|\mathcal{L}| \ge L$. Let $\mathcal{L}'$ be
any list containing $c^*$ and $L - 1$ other codewords from $\mathcal{L}$. For $c \in \mathcal{L}' \subseteq \mathcal{L}$, we have
$d(x, c) \le \beta p n + (\lambda - \beta(1 - p))n = (\lambda - \beta(1 - 2p))n$, whereas $d(x, c^*) = (\lambda - \beta)n$. Averaging these $L$
distances, $D_{\mathrm{avg}}(x, \mathcal{L}') \le (\lambda - \beta(1 - 2p + 2p/L))n$. Now, pick $\beta$ so that this is at most $pn$; that is,
set $\beta := (\lambda - p)/(1 - 2p + 2p/L)$. For this choice of $\beta$, the list $\mathcal{L}'$ contradicts the $(p, L)$-strong list
decodability of $C$. Thus, contrary to our starting assumption, the rate is at most $A\beta + B\beta^2 + o(1)$
(for the special choice of $\beta$). We can further upper bound this by (see Claim 23 in Appendix A)

$$h(\lambda) - h(p) - \frac{A_0(\lambda - p)}{L} + B_0(\lambda - p)^2$$

for some $A_0 > 0$ and $B_0 < \infty$ depending on $p$. Setting $\lambda := p + A_0/(2B_0L)$ gives the claim. ■
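
The two calculations that close the proof of Theorem 9 (averaging the $L$ distances, and balancing the linear gain against the quadratic loss) can be verified exactly with rational arithmetic. In the sketch below the function names are ours, and the numerical values used for $A_0$ and $B_0$ are arbitrary placeholders, since the text only asserts that such constants exist.

```python
from fractions import Fraction as F


def avg_radius(lam, beta, p, L):
    """Average of the L distance bounds (divided by n) for the list L'."""
    far = lam - beta * (1 - 2 * p)     # bound for the L-1 ordinary members
    near = lam - beta                  # exact value for the special c*
    return ((L - 1) * far + near) / L


def beta_choice(lam, p, L):
    """The beta for which the averaged bound equals p."""
    return (lam - p) / (1 - 2 * p + 2 * p / L)


def rate_gap(t, a0, b0, L):
    """-A0*t/L + B0*t^2 with t = lam - p; negative values lower the rate."""
    return -a0 * t / L + b0 * t * t


if __name__ == "__main__":
    p, lam, L = F(1, 4), F(3, 10), 8
    beta = beta_choice(lam, p, L)
    print(avg_radius(lam, beta, p, L) == p)
    a0, b0 = F(3), F(5)                # placeholder constants
    t = a0 / (2 * b0 * L)
    print(rate_gap(t, a0, b0, L), -a0 * a0 / (4 * b0 * L * L))
```

With $\lambda - p = A_0/(2B_0L)$ the gap equals $-A_0^2/(4B_0L^2)$, which is the source of the $c_p/L^2$ term in Theorem 9.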



4 Bounds for (standard) list decodability 

In this section, we consider the rate vs. list size trade-off for the traditional list-decodability notion.
For the special case when the fraction of errors is close to 1/2, [10] showed that any code family
of growing size correcting up to a $1/2 - \gamma$ fraction of errors must have a list size $\Omega(1/\gamma^2)$, which
is optimal up to constant factors. When $p$ is bounded away from 1/2, Blinovsky [1, 3] gives the
best known bounds on the rate of a $(p, L)$-list decodable code. He showed that any code of rate
$1 - h(p) - \gamma$ has list-size at least $\Omega_p(\log(1/\gamma))$.^6 For completeness we give a self-contained and
simpler proof of this result in this section.

Theorem 12 (Blinovsky [1, 3]). 1. Suppose $C$ is a $(\lambda; p, L)$-list decodable code with $\lambda = p + \frac{1}{2}p^L$. Then
$|C|$ is at most $2L^2/\lambda$ (independent of the blocklength $n$). (In particular, the rate approaches 0 as
$n \to \infty$.)

2. Suppose $C$ is a $(p, L)$-list decodable code. Then there exists a constant $c_p > 0$ such that the rate of $C$
is at most $1 - h(p) - 2^{-c_p L}$.

Proof: By Lemma 8, it suffices to show the first part, since then the rate of $C$ is upper bounded
by $1 - h(\lambda) = 1 - h(p + \frac{1}{2}p^L) \le 1 - h(p) - \Omega_p(p^L)$ (using Taylor expansion). We prove the
first part by the first moment method. Assume that $|C| > 2L^2/\lambda$. Pick a random (distinct) $L$-tuple
of codewords $\mathcal{L} = \{c^1, c^2, \ldots, c^L\} \subseteq C$, and define $x$ by $x_i = 1$ iff $c^j_i = 1$ for all $1 \le j \le L$. Note
that $x$ is at a distance of $\lambda n - \mathrm{wt}(x)$ from each $c^j$, so that $E[D_{\max}(x, \mathcal{L})] = \lambda n - E[\mathrm{wt}(x)]$. Thus to
complete the proof, it suffices to show that $E[\mathrm{wt}(x)] \ge \frac{1}{2}p^L n$.

^6 He states his results in a different form however. The reader is referred to [13] for this form of the result.

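Define the function $\vartheta : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ by $\vartheta(z) = \binom{\max\{z, L-1\}}{L}$. By standard closure properties of
convex functions, $\vartheta$ is convex on $\mathbb{R}_{\ge 0}$. Now, let $M := |C|$ and $M_i$ be the number of codewords
with 1 in the $i$-th position. Then it can be verified that $x_i = 1$ with probability $\vartheta(M_i)/\binom{M}{L}$. Thus, by
linearity of expectations, the expected weight of $x$ is

$$E[\mathrm{wt}(x)] = \sum_{i=1}^{n} \frac{\vartheta(M_i)}{\binom{M}{L}} \overset{(a)}{\ge} n \cdot \frac{\vartheta(\lambda M)}{\binom{M}{L}} \overset{(b)}{=} n \cdot \frac{\binom{\lambda M}{L}}{\binom{M}{L}}.$$

Here we have used (a) Jensen's inequality (the average of the $M_i$ is $\lambda M$, since every codeword has
weight $\lambda n$), and (b) the fact that $\lambda M > 2L^2 > L$. Finally, a straightforward approximation gives the
promised bound:

$$\frac{\binom{\lambda M}{L}}{\binom{M}{L}} \ge \left(\frac{\lambda M - L}{M}\right)^L = \lambda^L \left(1 - \frac{L}{\lambda M}\right)^L \ge \lambda^L \left(1 - \frac{L^2}{\lambda M}\right) \ge \frac{1}{2}\lambda^L \ge \frac{1}{2}p^L.$$

■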
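
The final approximation in the proof of Theorem 12 can be checked exactly for small parameters. The sketch below is only an illustration with names of our own choosing: `w` stands for $\lambda M$, the number of codewords having a 1 in a fixed position, and the check confirms that the probability that $L$ distinct random codewords all have a 1 there is at least $\frac{1}{2}\lambda^L$ whenever $\lambda M > 2L^2$.

```python
from fractions import Fraction
from math import comb


def all_ones_probability(w: int, M: int, L: int) -> Fraction:
    """Pr that L distinct codewords out of M all lie among a fixed set of w."""
    return Fraction(comb(w, L), comb(M, L))


def bound_holds(w: int, M: int, L: int) -> bool:
    """Check C(w, L) / C(M, L) >= (w/M)^L / 2, assuming w > 2 * L^2."""
    lam = Fraction(w, M)
    return all_ones_probability(w, M, L) >= lam ** L / 2


if __name__ == "__main__":
    print(all_ones_probability(50, 100, 4), Fraction(1, 2) ** 4 / 2)
    print(all(
        bound_holds(w, M, L)
        for M in range(1, 80)
        for w in range(1, M + 1)
        for L in range(1, 7)
        if w > 2 * L * L
    ))
```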

The above method can be adapted for $q$-ary codes with an additional trick.

Theorem 13. 1. Suppose $C$ is a $q$-ary $(\lambda; p, L)$-list decodable code with $\lambda = p + \frac{1}{2L}p^L$. Then $|C|$ is at
most $2L^2/\lambda$.

2. Suppose $C$ is a $q$-ary $(p, L)$-list decodable code. Then there exists a constant $c_{p,q} > 0$ such that the
rate of $C$ is at most $1 - h_q(p) - 2^{-c_{p,q} L}$.



Before we prove Theorem 13, we will state a convenient lemma due to Erdős. (See Section 2.1
of [12] for reference.) This result was implicitly established in our proof of Theorem 12; so we will
omit the formal proof.

Lemma 14 (Erdős 1964). Suppose $\mathcal{A}$ is a set system over the ground set $[n]$, such that each $A \in \mathcal{A}$ has
size at least $\lambda n$. Then if $|\mathcal{A}| \ge 2L^2/\lambda$, there exist distinct $A_1, A_2, \ldots, A_L$ in $\mathcal{A}$ such that $\bigcap_{i=1}^{L} A_i$ has
size at least $\frac{1}{2}n\lambda^L$.

Proof of Theorem 13: As in Theorem 12, it suffices to show the first part. Towards a contradiction,
assume $|C| > 2L^2/\lambda$. Define the set system $\mathcal{A} = \{\mathrm{supp}(c) : c \in C\}$. By Lemma 14, there exists
an $L$-tuple $\{c^1, c^2, \ldots, c^L\}$ of codewords such that the intersection of their supports, say $S$, has size
at least $\frac{1}{2} n \lambda^L \ge \frac{1}{2} n p^L$. Arbitrarily partition the coordinates in $S$ into $L$ parts $\{S_1, \ldots, S_L\}$ of almost-
equal size $\frac{n}{2L} \cdot p^L$. Now, define the center $x$ by:

\[
x_i = \begin{cases} c^j_i, & \text{if } i \in S_j, \text{ and} \\ 0, & \text{if } i \notin S. \end{cases}
\]

Note that $x$ agrees with $c^j$ on $S_j$ and is zero outside $S \subseteq \mathrm{supp}(c^j)$, so that $d(x, c^j) \le \lambda n - \frac{1}{2L} p^L n = pn$.
Therefore, $\{c^1, \ldots, c^L\}$ is a bad list of codewords contradicting the $(p, L)$-list decodability of $C$. ■
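
The center used in this proof can be illustrated with a small Python sketch. This is only an illustration under our own assumptions: the function names are hypothetical, symbols are integers with 0 as the zero symbol, and the parts $S_j$ are assigned round-robin, which is one arbitrary way to obtain almost-equal sizes.

```python
from typing import List, Sequence


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Hamming distance between two equal-length words."""
    return sum(a != b for a, b in zip(u, v))


def weight(c: Sequence[int]) -> int:
    """Number of nonzero coordinates of c."""
    return sum(s != 0 for s in c)


def build_center(codewords: List[Sequence[int]]) -> List[int]:
    """Copy c^j on the part S_j of the common support S; put 0 outside S.

    Parts are assigned round-robin (an arbitrary choice), so their sizes
    differ by at most one.
    """
    n, L = len(codewords[0]), len(codewords)
    common = [i for i in range(n) if all(c[i] != 0 for c in codewords)]
    x = [0] * n
    for pos, i in enumerate(common):
        x[i] = codewords[pos % L][i]
    return x


def smallest_part(codewords: List[Sequence[int]]) -> int:
    """Size of the smallest part S_j under the round-robin split."""
    n, L = len(codewords[0]), len(codewords)
    common = sum(all(c[i] != 0 for c in codewords) for i in range(n))
    return common // L
```

For every codeword in the list, `hamming(build_center(cs), c)` is at most `weight(c) - smallest_part(cs)`, which is the inequality $d(x, c^j) \le \lambda n - |S_j|$ used above.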



5 Constant-weight vs. General codes 

In this section, we will understand the rate vs. list-size trade-offs for constant-weight codes, that
is, codes with every codeword of weight $\lambda n$, where $\lambda \in (p, 1/2]$ is a parameter. (Note that setting
$\lambda = 1/2$ corresponds to arbitrary codes having no weight restrictions.) As observed earlier, a
typical approach in coding theory to establish rate upper bounds is to study the problem under
the above constant-weight restriction. One then proceeds to show a strong negative result of the
flavor that a code with the stated properties must have a constant size (and in particular zero rate).
For instance, the first part of Theorem 12 above is of this form. Finally, mapping this bound to
arbitrary codes, one obtains a rate upper bound of $1 - h(\lambda)$ for the original problem. (Note that
Lemma 8 provides a particular formal example of the last step.)

In particular, Blinovsky's rate upper bound (Theorem 12)^7 of $1 - h(p) - 2^{-O(L)}$ for $(p, L)$-
list decodable codes follows this approach. More precisely, he proves that, under the weight-$\lambda$
restriction, such a code must have zero rate for all $\lambda \le p + 2^{-cL}$ for some $c < \infty$. One may then
imagine improving the rate upper bound to $1 - h(p) - L^{-O(1)}$ simply by establishing the latter result
for correspondingly higher values of $\lambda$ (i.e., up to $p + L^{-O(1)}$). We show that this approach cannot
work by establishing that list-decodable codes of positive (but possibly small) rates exist as long as
$\lambda - p \ge 2^{-O(L)}$. Thus Blinovsky's result identifies the correct zero-rate regime for the list-decoding
problem; in particular, his bound is also the best possible if we restrict ourselves to this approach.

In the opposite direction, we show that the task of establishing rate upper bounds for constant-
weight codes is not significantly harder than the general problem. Formally, we show that if
the "gap to capacity" for general codes is $\gamma$, then the gap to capacity for weight-$\lambda$ codes is at least
$\gamma \left( \frac{\lambda - p}{1/2 - p} \right)$. Stated differently, if our goal is to establish an $L^{-O(1)}$ lower bound on the gap $\gamma$, then
we do not lose by first passing to a suitable $\lambda$ (that is not too close to $p$).



5.1 Zero-rate regime 

We now prove the existence of $(p, L)$-strongly list-decodable codes of positive rate where all
codewords have constant weight which is very close to $pn$.

Theorem 15. For every $0 < p < 1/2$, there exists $d = d(p) = \frac{1}{2}(1/2 - p)^2 \in (0, \infty)$ such that for
all sufficiently large $L$, there exists a $(\lambda; p, L)$-strongly list decodable code of rate at least $R - o(1)$ with
$R = e^{-2dL}$ and $\lambda \in [p, p + 12e^{-dL}]$.

The proof proceeds by random coding followed by expurgation. Set $\epsilon := 4e^{-dL}$ and $\lambda' := p + 2\epsilon$.
Now, pick a random $2^{Rn} \times n$ code matrix $C$ with each entry set to 1 with probability $\lambda'$. For our
choice of parameters, we can show that whp, $C$ satisfies the following properties:

• $C$ is $(p, L)$-strongly list-decodable.

• Every codeword has weight $(\lambda' \pm \epsilon)n$. In particular, the maximum weight is at most $(p + 3\epsilon)n$.

Pick a $C$ satisfying these two properties, and let $C_i$ denote the sub-code of $C$ consisting of the
weight-$i$ codewords. Then, defining $i^* = \lambda n$ to be the most popular weight, the subcode $C_{i^*}$
satisfies our constraints. The formal proof follows.
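
The expurgation step above can be sketched in Python as follows. This is a toy illustration, not the construction's actual parameter regime: the function names and the small parameters in the usage note are our own assumptions, and the sketch does not check strong list decodability.

```python
import random
from collections import Counter
from typing import List, Tuple


def random_biased_code(num_codewords: int, n: int, bias: float,
                       rng: random.Random) -> List[Tuple[int, ...]]:
    """Each entry of the code matrix is 1 independently with probability `bias`."""
    return [tuple(1 if rng.random() < bias else 0 for _ in range(n))
            for _ in range(num_codewords)]


def most_popular_weight_subcode(code: List[Tuple[int, ...]]):
    """Return (w, subcode) where w is the most frequent Hamming weight."""
    weights = Counter(sum(c) for c in code)
    w_star, _ = weights.most_common(1)[0]
    return w_star, [c for c in code if sum(c) == w_star]
```

Since there are only $n + 1$ possible weights, the returned subcode always keeps at least a $1/(n+1)$ fraction of the codewords, which is the pigeonhole step that costs only $o(1)$ in rate.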

Proof of Theorem 15: Set $\epsilon := 4e^{-dL}$ and $\lambda' := p + 2\epsilon$. Assume that $L$ is large enough so that
$p + 4\epsilon < 1/2$ and verify that $1/2 - \lambda' \ge \frac{1}{2}(1/2 - p)$ in this case. Pick a random code $C : \{0, 1\}^{Rn} \to
\{0, 1\}^n$, where for each $y \in \{0, 1\}^{Rn}$, every coordinate of $C(y)$ is chosen independently to be 1 with
probability $\lambda'$. First, by Chernoff bound followed by union bound, the probability that there exists
$y \in \{0, 1\}^{Rn}$ with $|\mathrm{wt}(C(y)) - \lambda' n| > \epsilon n$ is at most $2^{Rn} \cdot 2^{-\epsilon^2 n}$. This is our first bad event.

^7 For notational ease, we suppress the dependence on $p$ in the $O$ and $\Omega$ notations in this informal discussion.
Now, we bound the probability of the occurrence of a bad list of codewords. Fix a list $\{y_1, \ldots, y_L\} \subseteq
\{0, 1\}^{Rn}$ and define $x$ to be its centroid: that is, $x_j$ is the majority of the $L$ bits $\{C(y_i)_j : 1 \le i \le
L\}$. By Chernoff bound, for $j \in [n]$, the probability that $x_j = 1$ is at most

\[
e^{-2(1/2 - \lambda')^2 L} \le e^{-\frac{1}{2}(1/2 - p)^2 L} = e^{-Ld} = \epsilon/4 .
\]

By a second application of Chernoff bound, the probability that the weight of x exceeds en = (1 + 

3) (en/4) is at most e 3 = e~ ' . Our second bad event is that there exists a list {yi ^yi} 
such that the weight of x is > en. By union bound over all possible lists, the probability of this 
event is at most (^f ) • e-^-/^ < e^RL-^^/^)n_ 

Since R < min{e^, e/2L}, the random code avoids both the bad events with probability 1 — 
2^^p^^\ Fix any such code C (avoiding both bad events). For any list {yi, . . . , yi} C C with 
centroid $x$, for all $1 \le i \le L$, we have

\[
d(x, y_i) \ge \mathrm{wt}(y_i) - \mathrm{wt}(x) \ge (\lambda' - \epsilon)n - \epsilon n = (\lambda' - 2\epsilon)n = pn,
\]

where $x$ is the center of the list as defined above. Therefore, the average distance of the list from
the center is also at least $pn$. Hence, the code $C$ is $(p, L)$-strongly list decodable. Now, using
the pigeonhole principle, we can find a sub-code with all codewords having weight exactly $w$
having size at least $2^{Rn}/(n + 1) = 2^{(R - o(1))n}$. Defining $\lambda := w/n$, we obtain a $(\lambda; p, L)$-strongly list
decodable code of rate $R - o(1)$. Finally, it is clear that $\lambda \le \lambda' + \epsilon = p + 3\epsilon \le p + 12e^{-dL}$. ■



5.2 A reverse connection between constant-weight and arbitrary codes 
Lemma 16. Let $\gamma = \gamma_{p,L}$ be the gap to capacity for arbitrary codes. Then, for every $\lambda \in (p, 1/2]$,

\[
h(\lambda) - h(p) - \gamma \le R_{p,L}(\lambda) \le h(\lambda) - h(p) - \frac{\lambda - p}{1/2 - p}\,\gamma .
\]

Proof: The left inequality is essentially the content of Claim 8; we show the second inequality
here. Suppose $C$ is a $(\lambda; p, L)$-list decodable code of rate $R$. Pick a random subset $S$ of coordinates
of size $\alpha n$ with $\alpha = (\lambda - p)/(1/2 - p)$. (The motivation for this choice will become clear shortly.)
Consider the subcode $C'$ consisting of the codewords $c \in C$ such that $\mathrm{wt}(c|_S) \ge \alpha n/2$. For our
choice of $\alpha$, one can verify that if $c \in C'$, then $c$ has weight at most $p(1 - \alpha)n = p|\bar{S}|$ when restricted
to $\bar{S}$.

The key insight is that the code $C'|_S := \{c|_S : c \in C'\}$ (of blocklength $\alpha n$) is $(p, L)$-list
decodable. Suppose not. Then there exists a center $x' \in \{0, 1\}^{S}$ and a size-$L$ list $\mathcal{L} \subseteq C'$ such that
$d(x', c|_S) \le p\alpha n$ for all $c \in \mathcal{L}$. Now, extend $x'$ to $x \in \{0, 1\}^n$ such that $x|_S = x'$ and $x_i$ is zero for
$i \notin S$. Then, for $c \in \mathcal{L}$, we have $d(x, c) \le d(x', c|_S) + \mathrm{wt}(c|_{\bar{S}}) \le p\alpha n + p(1 - \alpha)n = pn$. Thus,
$\mathcal{L} \subseteq B(x, pn)$, contradicting the $(p, L)$-list decodability of $C'$ (and hence of $C$).

By hypothesis, we can bound the size of $C'|_S$ by $2^{(1 - h(p) - \gamma)\alpha n}$ (with probability 1). On the other
hand, in expectation, the size of $C'|_S$ is at least

\[
|C| \cdot \frac{\binom{\lambda n}{\alpha n/2}\binom{(1-\lambda)n}{\alpha n/2}}{\binom{n}{\alpha n}} = |C| \cdot \frac{\binom{\alpha n}{\alpha n/2}\binom{(1-\alpha)n}{(\lambda - \alpha/2)n}}{\binom{n}{\lambda n}},
\]

appealing to Fact 4 again. Finally, verify that $\lambda - \alpha/2 = p(1 - \alpha)$. By standard approximation,
this quantity is at least $\exp_2[(R + \alpha + (1 - \alpha)h(p) - h(\lambda) - o(1))n]$.^8 Comparing the upper and

^8 We use $\exp_2(z)$ to denote $2^z$.

lower bound on the (expected) size of $C'|_S$, we get $R + \alpha + (1 - \alpha)h(p) - h(\lambda) \le (1 - h(p) - \gamma)\alpha$.
Rearranging this inequality gives the desired bound $R \le h(\lambda) - h(p) - \alpha\gamma$. ■
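
The two arithmetic facts used in this proof, namely $\lambda - \alpha/2 = p(1 - \alpha)$ and the equality of the two binomial expressions, can be checked exactly on a small instance. The following Python sketch is only a sanity check under our own choice of toy parameters; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def alpha(lam: Fraction, p: Fraction) -> Fraction:
    """The choice alpha = (lambda - p) / (1/2 - p) from the proof."""
    return (lam - p) / (Fraction(1, 2) - p)


def prob_by_subset(n: int, w: int, s: int, t: int) -> Fraction:
    """Fix a word of weight w; pick a uniform size-s subset S.
    Probability that the word has exactly t ones inside S."""
    return Fraction(comb(w, t) * comb(n - w, s - t), comb(n, s))


def prob_by_word(n: int, w: int, s: int, t: int) -> Fraction:
    """Fix a size-s subset S; pick a uniform word of weight w.
    Probability that the word has exactly t ones inside S."""
    return Fraction(comb(s, t) * comb(n - s, w - t), comb(n, w))
```

With $\lambda = 2/5$ and $p = 1/5$ we get $\alpha = 2/3$, and for $n = 15$ both expressions evaluate to the same rational number, as the double-counting argument predicts.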

6 List-size Bounds for Random codes 

In this section, we establish optimal (up to constant factors) bounds on the list-size of random
codes, both general as well as linear. Results of this vein were already shown by Rudra for the
errors case [13], based on the large near-disjoint packings of Hamming balls implied by Shannon's
capacity theorems. Here we give a direct proof based on the second moment method.^9 In addition,
our proofs extend easily to give list-size bounds for the erasures case as well.

By a random code, we mean a random map $\mathrm{Enc} : \{0, 1\}^k \to \{0, 1\}^n$ where the image $\mathrm{Enc}(x)$
of each $x \in \{0, 1\}^k$ is picked independently and uniformly at random from $\{0, 1\}^n$. On the other
hand, to obtain a random linear code, we fix an arbitrary basis for the vector space $\{0, 1\}^k$, and the
encoding of the basis vectors is chosen independently and uniformly at random. The encoding
map $\mathrm{Enc}$ is then extended for all messages in $\{0, 1\}^k$ via linearity.
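
The two random ensembles can be sketched in Python for small binary parameters. This is a minimal illustration, assuming the standard basis of $\{0,1\}^k$ as the fixed basis; the function names are ours.

```python
import random
from itertools import product
from typing import Dict, Tuple

Word = Tuple[int, ...]


def random_code(k: int, n: int, rng: random.Random) -> Dict[Word, Word]:
    """Each message gets an independent uniform codeword in {0,1}^n."""
    return {msg: tuple(rng.randrange(2) for _ in range(n))
            for msg in product((0, 1), repeat=k)}


def random_linear_code(k: int, n: int, rng: random.Random) -> Dict[Word, Word]:
    """Images of the standard basis vectors are uniform; extend by linearity over GF(2)."""
    basis_images = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(k)]

    def enc(msg: Word) -> Word:
        out = [0] * n
        for bit, row in zip(msg, basis_images):
            if bit:
                out = [o ^ r for o, r in zip(out, row)]
        return tuple(out)

    return {msg: enc(msg) for msg in product((0, 1), repeat=k)}
```

In the linear ensemble only the $k$ basis images are independent; every other codeword is determined by them, which is why the variance calculation for linear codes must track spans rather than set intersections.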

6.1 Bounds for Random codes under Errors 

As mentioned before, our results proceed directly via the second moment method. Towards this
goal, we define a random variable $Z$ that counts the number of witnesses (i.e., a bad list of
codewords together with the center) that certify the violation of the $(p, L)$-list decodability property.
Note that the code is $(p, L)$-list decodable iff $Z = 0$. We then show that $Z$ has large expectation
(i.e., exponential in $n$) and that $\mathrm{Var}[Z] = \exp(-\Omega_{q,p,L}(n))\,\mathbb{E}[Z]^2 = o(\mathbb{E}[Z]^2)$. Using the Chebyshev
inequality, we can conclude that $Z > 0$, except with an exponentially small probability, which
gives the claim.

As a particular example, consider the case of random general codes under errors. Here, we let
$X$ be an arbitrary distinct $L$-tuple of messages $\{x_1, x_2, \ldots, x_L\}$ and $a$ be an arbitrary center.
Then define the indicator random variable $I(X, a)$ for the event that $d(a, \mathrm{Enc}(x)) \le pn$ for all
$x \in X$. Finally, define $Z := \sum_{X, a} I(X, a)$. The mean and variance estimates of $Z$ follow by a
standard calculation. Our formal results and proofs follow.
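
For intuition, the witness count $Z$ can be computed by brute force on tiny binary codes. The sketch below is ours and is exponential in $n$, so it is only meant for toy sizes; lists are taken as unordered $L$-subsets of codeword indices, which is an assumption about the counting convention.

```python
from itertools import product
from math import comb
from typing import Sequence, Tuple


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Hamming distance between two equal-length words."""
    return sum(a != b for a, b in zip(u, v))


def count_witnesses(codewords: Sequence[Tuple[int, ...]], n: int,
                    radius: int, L: int) -> int:
    """Z = number of pairs (X, a): X an L-subset of codeword indices and a a
    center in {0,1}^n with every codeword of X within `radius` of a."""
    z = 0
    for a in product((0, 1), repeat=n):
        close = sum(1 for c in codewords if hamming(a, c) <= radius)
        z += comb(close, L)  # number of L-subsets inside this ball
    return z
```

The code is list decodable with list size below $L$ at this radius exactly when `count_witnesses` returns 0.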

6.1.1 General codes 

Theorem 17. For every $0 < p < 1 - 1/q$ and $\gamma > 0$, with probability $1 - q^{-\Omega_{p,\gamma}(n)}$, a random $q$-ary code
of rate $1 - h_q(p) - \gamma$ is not $\left(p, \frac{1 - h_q(p)}{2\gamma}\right)$-list decodable.

Before presenting the proof, let us define some convenient notation. We denote by $B_q(a, r)$
the Hamming ball with center $a$ and radius $r$. Define $\mathrm{Vol}_q(n, r)$ to be the volume of $B_q(\cdot, r)$, and
$\mu_q(n, r) := \mathrm{Vol}_q(n, r)/q^n$. It is a standard fact that $q^{h_q(z)n - o(n)} \le \mathrm{Vol}_q(n, zn) \le q^{h_q(z)n}$ for $z \le 1 - 1/q$. We will
use $B(a)$ (resp. $\mu$) as a shorthand to denote $B_q(a, pn)$ (resp. $\mu_q(n, pn)$).
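
The volume notation can be made concrete with a short Python sketch. The helper names are ours; the code computes $\mathrm{Vol}_q(n, r)$ exactly and the $q$-ary entropy $h_q$ numerically, so the upper bound $\mathrm{Vol}_q(n, zn) \le q^{h_q(z)n}$ can be checked on small cases.

```python
from math import comb, log


def vol(q: int, n: int, r: int) -> int:
    """Number of words in [q]^n within Hamming distance r of a fixed center."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))


def h_q(q: int, z: float) -> float:
    """q-ary entropy function, with the usual conventions at z = 0 and z = 1."""
    if z == 0.0:
        return 0.0
    if z == 1.0:
        return log(q - 1, q)
    return z * log(q - 1, q) - z * log(z, q) - (1 - z) * log(1 - z, q)


def mu(q: int, n: int, r: int) -> float:
    """Fraction of the space occupied by a ball of radius r."""
    return vol(q, n, r) / q ** n
```

For example, `vol(2, 10, 3)` counts $1 + 10 + 45 + 120$ words.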

Proof: At a high level, we apply the second moment method to the random variable $Z$ that counts
the number of witnesses (i.e., a bad list of codewords and the corresponding center) certifying the
violation of the $(p, L)$-list decodability property. Consider a random code $\mathrm{Enc} : [q]^{k = Rn} \to [q]^n$

^9 We remark that the argument in [13] is also based on the second moment method, but applied to a more complicated
random variable.
with $R = 1 - h_q(p) - \gamma$. For a list of $L$ messages $X = \{x_1, x_2, \ldots, x_L\} \subseteq [q]^k$, and $a \in [q]^n$,
define the indicator variable $I(X, a)$ to be 1 iff $\mathrm{Enc}(x) \in B(a)$ for all $x \in X$. Then define $Z :=
\sum_{X, a} I(X, a)$. Clearly, $Z > 0$ iff the code is not $(p, L)$-list decodable.

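For every $x$ and every $a$, the event $\mathrm{Enc}(x) \in B(a)$ occurs w.p. $\mu_q(n, pn) = \mu$; therefore, by
independence, for a list of messages $X$, we have $\mathbb{E}[I(X, a)] = \mu^L$. Therefore, by the linearity of
expectations, $\mathbb{E}[Z] = \mu^L \binom{q^k}{L} q^n \ge L^{-L} (q^k\mu)^L q^n$. For two lists of messages $X$ and $Y$, say $X \sim Y$ if
$X \cap Y \ne \emptyset$. Clearly, if $X \not\sim Y$, then the events $I(X, a)$ and $I(Y, b)$ are independent. Therefore, we
have

\[
\begin{aligned}
\mathrm{Var}[Z] &= \sum_{X, Y} \sum_{a, b} \left( \mathbb{E}[I(X, a) I(Y, b)] - \mathbb{E}[I(X, a)]\,\mathbb{E}[I(Y, b)] \right) \\
&\le \sum_{X \sim Y} \sum_{a, b} \mathbb{E}[I(X, a) I(Y, b)] = \sum_{X \sim Y} \sum_{a, b} \Pr[I(X, a) = 1 \text{ and } I(Y, b) = 1] \\
&= q^{2n} \sum_{X \sim Y} \Pr_{a, b, \mathrm{Enc}}[I(X, a) = 1 \text{ and } I(Y, b) = 1],
\end{aligned}
\]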

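where, in addition to the randomness in the code, the centers $a$ and $b$ are also picked at random.

Fix a pair $(X, Y)$ such that $|X \cap Y| = \ell > 0$. Let $z \in X \cap Y$ be arbitrary. Then for any $a, b$, the
event $I(X, a) = I(Y, b) = 1$ implies that

• $\mathrm{Enc}(x) \in B(a)$ for $x \in X \setminus \{z\}$;

• $\mathrm{Enc}(y) \in B(b)$ for $y \in Y \setminus X$;

• $\{a, b\} \subseteq B(\mathrm{Enc}(z))$.

Thus this event happens with probability at most $\mu^{2L - \ell + 1}$. Finally, summing over all the pairs
$(X, Y)$ with $\ell > 0$ (the number of such pairs is at most $L^{2L} q^{k(2L - \ell)}$),

\[
\mathrm{Var}[Z] \le q^{2n} \sum_{\ell=1}^{L} L^{2L} q^{k(2L - \ell)} \mu^{2L - \ell + 1} \le \sum_{\ell=1}^{L} L^{2L} (q^k\mu)^{-\ell} \mu \cdot (\mathbb{E}Z)^2,
\]

after some rearrangement. Note that $q^k\mu = q^{-\gamma n}$ for our choice of the rate. Therefore,

\[
\mathrm{Var}[Z] \le \sum_{\ell=1}^{L} L^{2L} q^{\gamma \ell n} \mu \cdot (\mathbb{E}Z)^2 \le L^{2L+1} q^{\gamma L n - (1 - h_q(p))n} \cdot (\mathbb{E}Z)^2 .
\]

Therefore, letting $L = (1 - h_q(p))/(2\gamma)$, we observe that $\mathrm{Var}[Z] = q^{-\Omega_{p,\gamma}(n)} (\mathbb{E}Z)^2$. Finally, by
Chebyshev's inequality, $Z = 0$ (i.e., the code is $(p, L)$-list decodable) with probability at most $q^{-\Omega_{p,\gamma}(n)}$. ■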



6.1.2 Random linear codes 

We now turn to the case of random linear codes. 

Theorem 18. For every $0 < p < 1 - 1/q$ there exists $\delta_{q,p} > 0$ such that for all $\gamma > 0$, with probability
$1 - q^{-\Omega_{p,\gamma}(n)}$, a random $q$-ary linear code of rate at least $1 - h_q(p) - \gamma$ is not $(p, \delta_{q,p}/(2\gamma))$-list decodable.
Proof: We follow the same outline as in Theorem 17; we will only highlight the differences.
Consider a random linear code of dimension $k = (1 - h_q(p) - \gamma)n$. We define $I(X, a)$ in an identical
manner, but only for the linearly independent lists of messages $X$. (The definition of $Z$ is
unchanged.) Furthermore, for a pair of lists $X$ and $Y$, define $\ell = \ell(X, Y) := \dim(\mathrm{span}(X) \cap \mathrm{span}(Y))$
(rather than the size of their intersection). Moreover, we say that $X \sim Y$ iff $\ell > 0$; that is, iff
$\mathrm{span}(X)$ and $\mathrm{span}(Y)$ have a nontrivial intersection.

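Estimating $\mathbb{E}[Z]$ as before^10, we get $\mathbb{E}[Z] \ge \frac{1}{2} \cdot L^{-L} (q^k\mu)^L q^n$. Also, we can write
\[
\mathrm{Var}[Z] \le q^{2n} \sum_{X \sim Y} \Pr_{a, b, \mathrm{Enc}}[I(X, a) = 1 \text{ and } I(Y, b) = 1] .
\]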

Fix a pair $X, Y$ such that $\dim(\mathrm{span}\, X \cap \mathrm{span}\, Y) = \ell > 0$. Then, there exists $Z \subseteq Y$ of size $L - \ell$,
such that $X \not\sim Z$ and $Y \subseteq \mathrm{span}(X \cup Z)$. Let $y_0 \in Y \setminus Z$ be arbitrary. Then, since $y_0 \in \mathrm{span}(X \cup Z)$,
we have $y_0 = \sum_{u \in X \cup Z} \zeta(u)\, u$ for some scalars $\zeta(u)$. Note that it is possible that $y_0$ lies in the span
of $X$. But, since $Y$ is an independent set, $y_0$ cannot be written as a linear combination of vectors
from $Z \subseteq Y \setminus \{y_0\}$. Hence, there exists some $u \in X$ with $\zeta(u) \ne 0$.

In order to compute the desired probability, condition on the event that $\mathrm{Enc}(u) \in B(a)$ for
$u \in X$ and $\mathrm{Enc}(u) \in B(b)$ for $u \in Z$. We may re-express this as $\mathrm{Enc}(u) = \delta(u) + a$ for $u \in X$
and $\mathrm{Enc}(u) = \delta(u) + b$ for $u \in Z$. We thus get a family of iid random variables $\{\delta(u)\}_{u \in X \cup Z}$, each
of which is uniformly distributed inside $B(0)$. Further they are also independent of $a$ and $b$. In
terms of the $\delta(\cdot)$'s, we have $\mathrm{Enc}(y_0) - b = \sum_{u \in X \cup Z} \zeta(u)\delta(u) + \zeta(X)\, a + (\zeta(Z) - 1)\, b$, where
$\zeta(X) := \sum_{u \in X} \zeta(u)$ and $\zeta(Z) := \sum_{u \in Z} \zeta(u)$.

We claim that the conditional probability that $\mathrm{Enc}(y_0) - b \in B(0)$ is at most $q^{-\delta_{q,p} n}$. We discuss
two cases:

1. Suppose $\zeta(X) \ne 0$ or $\zeta(Z) \ne 1$. Then conditioned on the $\delta(\cdot)$'s, the random variable $\mathrm{Enc}(y_0) - b$
is distributed uniformly at random and hence falls inside $B(0)$ with probability $\mu$.

2. Suppose $\zeta(X) = 0$ and $\zeta(Z) = 1$. In this case, $\mathrm{Enc}(y_0) - b$ is simply a sum of some number of
points uniformly sampled from the ball $B(0)$. Notice that since $Y$ is not linearly dependent,
we must have $\zeta(x) \ne 0$ for some $x \in X$. Also, since the $\zeta(x)$'s sum to zero, there are at least two
nonzero $\zeta(x)$'s. Therefore, $\mathrm{Enc}(y_0) - b$ is the sum of $m \ge 2$ random points chosen uniformly
from $B(0)$. We use the following fact: there exists $\delta_{q,p} > 0$ such that, if $w_1, w_2, \ldots, w_m$
are $m \ge 2$ independent and uniformly random samples from $B(0)$, then the probability that
$w_1 + w_2 + \cdots + w_m$ is also inside $B(0)$ is bounded by $q^{-\delta_{q,p} n}$. Thus, the stated event also occurs
with probability at most $q^{-\delta_{q,p} n}$. (Without loss of generality, we may take $q^{-\delta_{q,p} n}$ to be larger than $\mu$.)

Therefore, the conditional probability is at most $q^{-\delta_{q,p} n}$. Thus, $\mathrm{Var}[Z] \le q^{2n} \sum_{X \sim Y} \mu^{2L - \ell} q^{-\delta_{q,p} n}$.
Proceeding as before, we get $\mathrm{Var}[Z] \le O(L^{2L+1} q^{\gamma L n - \delta_{q,p} n}\, \mathbb{E}[Z]^2)$. The conclusion follows
similarly. ■

6.2 Bounds for Random codes under Erasures 

To model erasures, we augment the alphabet $[q]$ with the erasure symbol $*$ to get $[q]_* := [q] \cup \{*\}$.
For $a \in [q]_*^n$, define $\mathrm{supp}_*(a)$ to be the set of all indices $i$ such that $a_i \ne *$. Let $\mathcal{E}_q(n, r)$ be the set
of $a \in [q]_*^n$ such that $|\mathrm{supp}_*(a)| = n - r$. Say that $a, b \in [q]_*^n$ agree with each other if $a_i = b_i$ for all
$i \in \mathrm{supp}_*(a) \cap \mathrm{supp}_*(b)$.

^10 Here we must be careful to sum over only the linearly independent $L$-tuples $X$.
Definition 19. A code $C \subseteq [q]^n$ is said to be $(p, L)$-erasure list decodable if for all $a \in \mathcal{E}_q(n, pn)$, at
most $L - 1$ codewords in $C$ (treated as strings over $[q]_*$) agree with $a$.
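
The agreement relation and Definition 19 can be illustrated by a brute-force Python sketch for tiny codes. The names are ours, `None` is a hypothetical stand-in for the erasure symbol $*$, and the exhaustive search is only feasible for toy parameters.

```python
from itertools import combinations, product
from typing import List, Optional, Sequence

ERASED = None  # hypothetical stand-in for the erasure symbol *


def supp(a: Sequence[Optional[int]]) -> set:
    """Indices of the non-erased positions of a."""
    return {i for i, s in enumerate(a) if s is not ERASED}


def agree(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> bool:
    """a and b agree if they coincide wherever both are non-erased."""
    return all(a[i] == b[i] for i in supp(a) & supp(b))


def is_erasure_list_decodable(code: List[Sequence[int]], n: int, q: int,
                              num_erased: int, L: int) -> bool:
    """True iff every received word with `num_erased` erasures agrees with
    at most L - 1 codewords."""
    for erased in combinations(range(n), num_erased):
        kept = [i for i in range(n) if i not in erased]
        for values in product(range(q), repeat=len(kept)):
            a = [ERASED] * n
            for i, v in zip(kept, values):
                a[i] = v
            if sum(1 for c in code if agree(c, a)) >= L:
                return False
    return True
```

For the binary even-weight code of length 3, one erasure always leaves a unique consistent codeword, while two erasures leave two.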

We now state our results showing limitations of erasure list-decodability of random and ran- 
dom linear codes. 

Theorem 20. For every $0 < p < 1$ and $\gamma > 0$, with probability $1 - q^{-\Omega_{p,\gamma}(n)}$, a random code of blocklength
$n$ and rate at least $1 - p - \gamma$ is not $\left(p, \frac{1-p}{2\gamma}\right)$-erasure list decodable.

Theorem 21. Let $q$ be a prime power. Then there exists a constant $c_q > 0$ such that for every $0 < p < 1$
and $\gamma > 0$, with probability $1 - q^{-\Omega_{p,\gamma}(n)}$, a random $q$-ary linear code of rate at least $1 - p - \gamma$ is not
$\left(p, q^{c_q p(1-p)/(2\gamma)}\right)$-erasure list decodable.

Note the exponential gap in the list size for linear and general codes under erasures. We 
present the proofs for the erasure case in Appendix B. 
A Rate upper bound for strong list decoding

We now finish the proof of Claim 11 which was used in the proof of our main result (Theorem 9) 
on the rate upper bound for strongly list-decodable codes. 

Proof of Claim 11: (Continued from Section 3.2) 
Divide the range of 6 into three regimes. 

Small 6:0<6<Xp + rT^I'^. We claim that in this regime, Q{b) > 2~°("). To see this, set 



It is easy to see that i > /3(1 and that i = j{X — d)n-\-o{n) for all 5. (Here, |(A — 5) n represents 
the expected weight of S.) Now, Q{S) is at least 



Large 5: A — < 5 < A. In this case, Q{6) can be very small, which affects the expectation. 
However, we can upper bound the probability of this event by Markov inequality: 




!^{X-6)n, if < (5 < Ap, and 
/3(1 — p)n, if Xp < 6 < Xp + n 



1/4 




For the prescribed choice of i, by Stirling's approximation, we can verify that Q{S) > 2 





Expressing this probability in terms of "rate", we get 



n 



1 



logOW > (A - S)h (<1^) + SH - Xh (^) - <,(!) 
Lower bounding this using Fact 6, we get

^logQ(5) > /3[(i-p)log-^^+plog^-logA]-/32(loge)f^\:^ + ^')-o(l). 
n (1— P) P \ X — J 

= /3[(l-p)log(A-5)+plog5 + %)-logA]-/32(loge) (^il^ + . 

When 6 is restricted to the middle regime, verify that ^^j^ + ^ = Op{l), independent of A and 5. 
Therefore, 

which we define to be Q{S). Note that, conveniently, Q{S) is a polynomial function of S. 

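The key claim is that in the desired range $\lambda p + n^{-1/4} \le \delta \le \lambda - \lambda^2/2$, $\hat{Q}(\delta)$ is both monotonically
decreasing and convex in $\delta$. Clearly, it suffices to show these two properties for the function $\tilde{Q}(\delta) :=
(\lambda - \delta)^{\tau_1} \delta^{\tau_2}$, where we have set $\tau_1 := \beta(1 - p)n$ and $\tau_2 := \beta p n$ for ease of notation.

1. Monotonicity. Differentiating the function wrt $\delta$, we get

\[
\frac{d\tilde{Q}}{d\delta} = (\lambda - \delta)^{\tau_1 - 1} \delta^{\tau_2 - 1} \left[ \tau_2(\lambda - \delta) - \tau_1\delta \right] .
\]

For our parameters, $\tau_2(\lambda - \delta) - \tau_1\delta = \beta n[p\lambda - \delta] \le -\beta n^{3/4} < 0$. Thus $\tilde{Q}$ is monotonically
decreasing.

2. Convexity. Differentiating twice wrt $\delta$, we get

\[
\frac{d^2\tilde{Q}}{d\delta^2} = (\lambda - \delta)^{\tau_1 - 2} \delta^{\tau_2 - 2} \left[ (\tau_1\delta - \tau_2(\lambda - \delta))^2 - \tau_1\delta^2 - \tau_2(\lambda - \delta)^2 \right] .
\]
For our choice of $\tau_1$ and $\tau_2$, this simplifies to

\[
(\lambda - \delta)^{\tau_1 - 2} \delta^{\tau_2 - 2} \left[ \beta^2 n^2 (\delta - p\lambda)^2 - \beta n \left( (1 - p)\delta^2 + p(\lambda - \delta)^2 \right) \right] .
\]

Finally, since $\delta - p\lambda \ge n^{-1/4}$, this expression is bounded below by $(\lambda - \delta)^{\tau_1 - 2} \delta^{\tau_2 - 2} [\beta^2 n^{3/2} -
2n]$. For fixed $\beta$ and sufficiently large $n$, this is nonnegative, establishing the convexity of $\tilde{Q}$.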

Now, to complete the proof, we essentially apply Jensen's inequality in the middle range. It is 
useful to consider two separate cases. 

1. Suppose $\Pr[\text{small } \delta] = \Pr[\delta < p\lambda + n^{-1/4}] \ge 1/n$. Then restricting ourselves to this range,
we have $\mathbb{E}[Q] \ge \frac{1}{n} \cdot 2^{-o(n)} = 2^{-o(n)}$.

2. On the other hand, suppose that the small values of $\delta$ have a probability at most $1/n$.
Conditioning on the event that $\delta$ is in the middle or high range (i.e., $\delta \ge \lambda p + n^{-1/4}$), we have
$\mathbb{E}[\delta \mid \text{middle or high range}] \le \frac{\mathbb{E}[\delta]}{1 - 1/n} = \lambda(1 - \lambda) + o(1)$. Now, further conditioning on the
middle range, the expectation can only go lower. That is,

\[
\mathbb{E}[\delta \mid \text{middle range}] \le \mathbb{E}[\delta \mid \text{middle or high range}] \le \lambda(1 - \lambda) + o(1) .
\]
Moreover, the probability of the middle range is at least $\lambda/2 - o(1) \ge \lambda/4$.
Therefore,

\[
\mathbb{E}[Q] \ge (\lambda/4)\, \mathbb{E}[Q(\delta) \mid \text{middle range}] \ge (\lambda/4)\, \mathbb{E}[\hat{Q}(\delta) \mid \text{middle range}] .
\]

Applying Jensen to the convex function $\hat{Q}$, we get

\[
\mathbb{E}[Q] \ge (\lambda/4)\, \hat{Q}(\mathbb{E}[\delta \mid \text{middle range}]) \ge (\lambda/4)\, \hat{Q}(\lambda(1 - \lambda) + o(1)) .
\]

The final inequality uses the monotonicity of $\hat{Q}$ and the fact that the conditional expectation
of $\delta$ is at most $\lambda(1 - \lambda) + o(1)$. Finally, it remains to estimate $\hat{Q}(\lambda(1 - \lambda) + o(1))$. Plugging
in $\lambda(1 - \lambda)$ in place of $\delta$, we see that

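\[
(1 - p)\log(\lambda - \delta) + p\log\delta + h(p) - \log\lambda = (1 - p)\log(\lambda^2) + p\log(\lambda(1 - \lambda)) + h(p) - \log\lambda,
\]
which on rearranging equals $-A$, where $A := (1 - p)\log\left(\frac{1-p}{\lambda}\right) + p\log\left(\frac{p}{1-\lambda}\right)$. Therefore, $\hat{Q}(\lambda(1 - \lambda) +
o(1)) \ge 2^{-(A\beta + O_p(\beta^2))n}$.

Therefore, $\mathbb{E}[Q]$ is at least the minimum of the two estimates, which is $\ge 2^{-(A\beta + O_p(\beta^2))n}$. ■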
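Claim 22. Suppose $A := (1 - p)\log\left(\frac{1-p}{\lambda}\right) + p\log\left(\frac{p}{1-\lambda}\right)$. Then,

\[
A \le (1 - 2p)\,\frac{h(\lambda) - h(p)}{\lambda - p} + \frac{5}{p}(\lambda - p) .
\]
Proof: Applying the inequality $\ln z \le z - 1$ with $z = \frac{1-p}{1-\lambda}$, we get

\[
p\log\left(\frac{1-p}{1-\lambda}\right) \le p(\log e)\,\frac{\lambda - p}{1 - \lambda} \le 4p(\lambda - p),
\]
since $\lambda \le 1/2$ and $e < 4$. Plugging this in the definition of $A$, and also using $\lambda > p$, we get
\[
A \le (1 - 2p)\log\left(\frac{1-p}{p}\right) + 4p(\lambda - p) \le (1 - 2p)h'(p) + 2(\lambda - p) .
\]

On the other hand, by the Lagrange Mean Value Theorem, there exists $\xi \in (p, \lambda)$ such that
$h(\lambda) - h(p) = h'(\xi)(\lambda - p)$. Since $h'$ is monotonically decreasing in $(0, 1/2)$, we have

\[
\frac{h(\lambda) - h(p)}{\lambda - p} \ge h'(\lambda) .
\]
Finally, we have

\[
h'(\lambda) = h'(p) - \int_p^\lambda |h''(z)|\,dz = h'(p) - \int_p^\lambda \frac{\log e}{z(1 - z)}\,dz \ge h'(p) - \frac{2\log e}{p}(\lambda - p) \ge h'(p) - \frac{4}{p}(\lambda - p),
\]

again using $e < 4$.

Plugging in both these estimates, we get

\[
A - (1 - 2p)\,\frac{h(\lambda) - h(p)}{\lambda - p} \le (1 - 2p)h'(p) + 2(\lambda - p) - (1 - 2p)h'(p) + \frac{4(1 - 2p)}{p}(\lambda - p)
\le \left(2 + \frac{4(1 - 2p)}{p}\right)(\lambda - p) .
\]
Finally, using the obvious inequalities $2 \le 1/p$ and $4(1 - 2p)/p \le 4/p$, we get the result. ■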
Claim 23. Suppose A := {1 — p) log + plog ( Av) ^^'^ ^ — ^(p) < ^ sufficiently 

small and /3 := (A — p)/(l — 2p + e). Tfen 

+ < /i(A) - h{p) - Ao(A - p)e + So(A - p)^ 

for some > and < 00 depending on p (and independent of e and A). 

Proof: From Claim 22, we have 

\ — p 



A^ < 



i^(MA)-Mp)) + 5^ 

A — p p 

1 - 2p , , , , 5(A -p)^ 

(/i(A) - + 



l-2p + e 



l-2p + e^ ^ ^ ^ " p{l - 2p) 
Assuming e < \ — 2p, we can upper bound this by 

AO / l-2p-e/2,^,,, ,,,, 5(A-p)2 

Now, by the convexity of /i(-), we have 

/i(A) - h{p) ^ h{l/2) - h{p) ^ 2(1 - h{p)) 

X-p - 1/2 -p l-2p ' 

Therefore, we have 

^ - ^ ^ ^-^^ 2(1 -2p) l-2p p{l-2p) 

< MA) - MP) -.(A -P)A^ + 



(1 - 2p)2 p(l - 2p) ' 



Also, 5/32 < Therefore, 



Ap^Bf < KX)-Hp)-eiX-p)^,H^^^+^.)iX-pf. 
Therefore the claim holds with Aq := p^^^ and Bq := ^(jz^ + (tJ^- ■ 

B Bounds for Random codes (Erasure case) 

We recall the notation. Let $[q]_* := [q] \cup \{*\}$. For $a \in [q]_*^n$, define $\mathrm{supp}_*(a)$ to be the set of all indices $i$
such that $a_i \ne *$. Let $\mathcal{E}_q(n, r)$ be the set of $a \in [q]_*^n$ such that $|\mathrm{supp}_*(a)| = n - r$. (We have $|\mathcal{E}_q(n, r)| =
\binom{n}{r} q^{n-r}$.) Say that $a, b \in [q]_*^n$ agree with each other if $a_i = b_i$ for all $i \in \mathrm{supp}_*(a) \cap \mathrm{supp}_*(b)$. Finally,
we will abbreviate $1 - p$ by $\alpha$ and $\mathcal{E}_q(n, pn)$ by $\mathcal{E}$.
B.1 Random General codes

Theorem 24 (Theorem 20 restated). For every $0 < p < 1$ and $\gamma > 0$, with probability $1 - 2^{-\Omega_{p,\gamma}(n)}$, a
random code of blocklength $n$ and rate at least $1 - p - \gamma$ is not $\left(p, \frac{1-p}{2\gamma}\right)$-erasure list decodable.



Proof: Consider a random code of blocklength $n$ and size $q^k$, where $k = (\alpha - \gamma)n$ and $\alpha = 1 - p$.
For a list of $L$ messages $X = \{x_1, x_2, \ldots, x_L\} \subseteq [q]^k$, and $a \in \mathcal{E}$, define the indicator random
variable $I(X, a)$ to be 1 iff $\mathrm{Enc}(x)$ agrees with $a$ for all $x \in X$. Let $Z := \sum_{X, a} I(X, a)$. It is clear
that the code $C$ is $(p, L)$-erasure list decodable if and only if $Z = 0$.

For every $X$ and $a$, we have $\Pr[I(X, a) = 1] = q^{-\alpha L n}$, so that $\mathbb{E}[Z] = \binom{q^k}{L}\binom{n}{pn} q^{\alpha n} \cdot q^{-\alpha L n} \ge
L^{-L} q^{-(\alpha n - k)L} q^{\alpha n} \binom{n}{pn}$, using standard approximations. Also,

\[
\mathrm{Var}[Z] \le \sum_{X \cap Y \ne \emptyset} \sum_{a, b} \Pr[I(X, a) = 1 \text{ and } I(Y, b) = 1] .
\]

Fix an arbitrary pair $(X, Y)$ with $|X \cap Y| = \ell > 0$. Further, let $S, T$ denote the supports of $a$
and $b$ respectively. Now, suppose $I(X, a) = I(Y, b) = 1$. Then, for an arbitrary $z \in X \cap Y$, $\mathrm{Enc}(z)$
agrees with both $a$ and $b$. Since $\mathrm{Enc}(z)$ is a string over $[q]$ (not involving $*$), this implies that
$a, b$ must themselves agree with each other.

The event $I(X, a) = I(Y, b) = 1$ requires that the encodings of points in $X \setminus Y$ (resp., $Y \setminus X$)
agree with $a$ (resp. $b$), whereas for $z \in X \cap Y$, $\mathrm{Enc}(z)$ must agree with both $a, b$. Therefore, the
probability of this event is at most

\[
q^{-(|S||X \setminus Y| + |T||Y \setminus X|)}\, q^{-|S \cup T||X \cap Y|} = q^{-2\alpha(L - \ell)n}\, q^{-|S \cup T|\ell} .
\]

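Summing over all pairs $(a, b)$, and noting that the number of pairs $(a, b)$ such that $\mathrm{supp}_*(a) = S$,
$\mathrm{supp}_*(b) = T$, and $a|_{S \cap T} = b|_{S \cap T}$ is equal to $q^{|S \cup T|}$, we get

\[
\sum_{a, b} \Pr[I(X, a) = 1 \text{ and } I(Y, b) = 1] = \sum_{S, T} q^{-2\alpha(L - \ell)n}\, q^{-|S \cup T|\ell}\, q^{|S \cup T|}
\le q^{-\alpha n(2L - \ell)}\, q^{\alpha n} \binom{n}{\alpha n}^2 = q^{-\alpha n(2L - \ell)}\, q^{\alpha n} \binom{n}{pn}^2 .
\]

Finally, summing over $X, Y$ pairs with $X \cap Y \ne \emptyset$, we get

\[
\mathrm{Var}[Z] \le \sum_{\ell=1}^{L} L^{2L} q^{k(2L - \ell)}\, q^{-\alpha n(2L - \ell)}\, q^{\alpha n} \binom{n}{pn}^2
\le L^{2L+1} q^{(\gamma L - \alpha)n}\, \mathbb{E}[Z]^2 .
\]

Therefore, for $L = \alpha/2\gamma$, we have $\mathrm{Var}[Z] = q^{-\Omega_{p,\gamma}(n)}\, \mathbb{E}[Z]^2$, and we are done.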
B.2 Random Linear codes

Theorem 25 (Theorem 21 restated). There exists a constant $c_q > 0$ such that for every $0 < p < 1$ and
$\gamma > 0$, with probability $1 - q^{-\Omega_{p,\gamma}(n)}$, a random linear code of rate at least $1 - p - \gamma$ is not $\left(p, q^{c_q p(1-p)/(2\gamma)}\right)$-
erasure list decodable.

Proof: First of all, note that if we demonstrate a bad list containing $L$ linearly independent points,
then it automatically implies a general bad list of size $q^{L-1}$. This follows from the fact that if
$c_1, c_2, \ldots, c_L$ agree with $a$, then any linear combination $\zeta_1 c_1 + \cdots + \zeta_L c_L$ also agrees with $a$, as long
as $\zeta_1 + \zeta_2 + \cdots + \zeta_L = 1$. (Note that the number of such linear combinations is exactly $q^{L-1}$.)
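
The blow-up from $L$ independent codewords to $q^{L-1}$ agreeing codewords can be verified on a toy example. The sketch below is ours and assumes $q$ is prime, so that arithmetic modulo $q$ is field arithmetic; `None` is a hypothetical stand-in for the erasure symbol.

```python
from itertools import product
from typing import Optional, Sequence, Set, Tuple

ERASED = None  # hypothetical stand-in for the erasure symbol *


def agree(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> bool:
    """a and b agree if they coincide wherever both are non-erased."""
    return all(x == y for x, y in zip(a, b)
               if x is not ERASED and y is not ERASED)


def affine_combinations(codewords: Sequence[Sequence[int]],
                        q: int) -> Set[Tuple[int, ...]]:
    """All sums zeta_1 c_1 + ... + zeta_L c_L (mod q) with the zeta_i summing
    to 1 (mod q). Assumes q is prime."""
    L, n = len(codewords), len(codewords[0])
    out = set()
    for zeta in product(range(q), repeat=L):
        if sum(zeta) % q == 1:
            out.add(tuple(sum(z * c[j] for z, c in zip(zeta, codewords)) % q
                          for j in range(n)))
    return out
```

On a non-erased coordinate where every $c_i$ equals $a_i$, the combination equals $(\sum_i \zeta_i) a_i = a_i$, which is why the constraint on the coefficients is exactly what preserves agreement.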

Consider a random linear code $C$ of dimension $k = (\alpha - \gamma)n$, where $\alpha = 1 - p$. For a linearly
independent set of $L$ messages $X \subseteq [q]^k$, and for every $a \in \mathcal{E}_q(n, pn)$, define $I(X, a)$ to be the
indicator random variable for the event that $\mathrm{Enc}(x)$ agrees with $a$ for all $x \in X$. Also, define $Z$ to
be $\sum_{X, a} I(X, a)$. For a fixed $X$ and $a$, $\mathbb{E}[I(X, a)] = q^{-\alpha L n}$. Summing over the linearly independent
$L$-tuples $X$, we get

\[
\mathbb{E}[Z] \ge \frac{1}{2} L^{-L} q^{kL} \cdot q^{-\alpha L n} \cdot q^{\alpha n} \binom{n}{np} .
\]

Define $\ell = \ell(X, Y) := \dim(\mathrm{span}(X) \cap \mathrm{span}(Y))$. For a pair $X, Y$ of lists, say $X \sim Y$ if $\mathrm{span}(X)$
and $\mathrm{span}(Y)$ have a nontrivial intersection; that is, $\ell > 0$. If $X \not\sim Y$, then $X$ and $Y$ are linearly
independent of each other. In turn, the random variables $I(X, a)$ and $I(Y, b)$ are also independent
of each other. So, we get

\[
\mathrm{Var}[Z] \le \sum_{X \sim Y} \sum_{a, b} \Pr[I(X, a) = 1 \text{ and } I(Y, b) = 1] .
\]

Fix a pair $X, Y$ such that $\dim(\mathrm{span}\, X \cap \mathrm{span}\, Y) = \ell > 0$. As in Theorem 18, we define $Z$ and
$y_0 \in Y \setminus Z$, and write $y_0 = \sum_{u \in X \cup Z} \zeta(u)\, u$ for some scalars $\zeta(u)$. For any $a, b \in \mathcal{E}$, let $S = \mathrm{supp}_*(a)$
and $T = \mathrm{supp}_*(b)$. (Note that for general codes, for the event $I(X, a) = I(Y, b) = 1$ to occur, the
strings $a$ and $b$ had to agree with each other on $S \cap T$; this is not so for linear codes.) For any
$x \in X$, conditioned on the event $\mathrm{Enc}(x)|_S = a|_S$, the random variable $\mathrm{Enc}(x)|_{T \setminus S}$ is uniformly
distributed over $[q]^{|T \setminus S|}$. Since $y_0 = \sum_{x \in X} \zeta(x)\, x + \sum_{z \in Z} \zeta(z)\, z$ with $\zeta(x) \ne 0$ for some $x \in X$,
it follows that $\mathrm{Enc}(y_0)|_{T \setminus S}$ is also uniformly distributed over $[q]^{|T \setminus S|}$. Hence, conditioned on
the event that $\mathrm{Enc}(x)$ agrees with $a$ for all $x \in X$ and $\mathrm{Enc}(z)$ agrees with $b$ for all $z \in Z$, the
probability that $\mathrm{Enc}(y_0)$ agrees with $b$ is at most $q^{-|T \setminus S|}$. Hence,

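\[
\sum_{a, b} \Pr[I(X, a) = I(Y, b) = 1] \le q^{-\alpha n(2L - \ell)} \cdot \sum_{S, T} q^{|S| + |T|}\, q^{-|T \setminus S|}
= q^{\alpha n(\ell - 2L)}\, q^{2\alpha n} \binom{n}{np}^2\, \mathbb{E}_{S, T}\left[ q^{-|T \setminus S|} \right],
\]

where the expectation is over $S, T \subseteq [n]$ of size $(1 - p)n$, chosen independently and uniformly
at random. By Lemma 26,

\[
\sum_{a, b} \Pr[I(X, a) = I(Y, b) = 1] \le q^{-\alpha n(2L - \ell)}\, q^{2\alpha n} \binom{n}{np}^2 q^{-c_q p(1 - p)n}
\]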
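for some $c_q > 0$. The variance of $Z$ can thus be bounded by

\[
\mathrm{Var}[Z] \le \sum_{\ell=1}^{L} L^{2L} q^{k(2L - \ell)}\, q^{\alpha n(\ell - 2L)}\, q^{2\alpha n} \binom{n}{np}^2 q^{-c_q p(1 - p)n}
= \sum_{\ell=1}^{L} L^{2L} q^{\ell\gamma n - c_q p(1 - p)n} \left( q^{(k - \alpha n)L}\, q^{\alpha n} \binom{n}{np} \right)^2,
\]

where the summand is maximized again for $\ell = L$. For $k = (\alpha - \gamma)n$, letting $L = (c_q/2) \cdot p(1 -
p)/\gamma$, we have $\mathrm{Var}\, Z = q^{-\Omega_{p,\gamma}(n)} (\mathbb{E}Z)^2$. We are thus done by an application of the second moment
method. ■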

Lemma 26. There exists $c_q > 0$ (independent of $n$ and $p$) such that if $S, T$ are independent random subsets
of $[n]$ of size $(1 - p)n$, then

\[
\mathbb{E}_{S, T}\left[ q^{-|T \setminus S|} \right] \le q^{-c_q p(1 - p)n} .
\]



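Proof: We prove this by thresholding on the value of $|T \cap S|$. By symmetry, the quantity $\mathbb{E}_{S,T}[q^{-|T \setminus S|}]$
is the same as $\mathbb{E}_{T}[q^{-|T \setminus S|}]$ where $S$ is fixed to be $\{1, 2, \ldots, (1 - p)n\}$. In this case, the random
variable $|T \cap S|$ has the hypergeometric distribution with mean $(1 - p)^2 n$. We will first upper
bound the probability of the event that $|T \setminus S| \le \frac{1}{2} p(1 - p)n$, which is equivalent to the tail event
$|T \cap S| \ge \mathbb{E}[|T \cap S|] + \frac{1}{2} p(1 - p)n$. By a standard Hoeffding bound for hypergeometric variables,

\[
\Pr\left[ |T \setminus S| \le \tfrac{1}{2} p(1 - p)n \right] = \Pr\left[ |T \cap S| \ge (1 - p)^2 n + \tfrac{p}{2}(1 - p)n \right]
\le \left[ \left( \frac{1 - p}{1 - p/2} \right)^{1 - p/2} \left( \frac{p}{p - p/2} \right)^{p/2} \right]^{(1 - p)n}
= 2^{-\left( (1 - p/2)\log\frac{1 - p/2}{1 - p} - p/2 \right)(1 - p)n} .
\]

It can be checked that in the interval $[0, 1)$, the inequality

\[
(1 - p/2)\log\frac{1 - p/2}{1 - p} \ge 7p/10
\]

holds, so that the tail probability is given by $\Pr[|T \setminus S| \le \frac{1}{2} p(1 - p)n] \le 2^{-p(1 - p)n/5}$. Finally, the
expectation is bounded as

\[
\mathbb{E}\left[ q^{-|T \setminus S|} \right] \le \Pr\left[ |T \setminus S| \le \tfrac{1}{2} p(1 - p)n \right] \cdot 1 + \Pr\left[ |T \setminus S| > \tfrac{1}{2} p(1 - p)n \right] \cdot q^{-\frac{1}{2} p(1 - p)n}
\le 2^{-p(1 - p)n/5} + q^{-\frac{1}{2} p(1 - p)n},
\]

which is $q^{-c_q p(1 - p)n}$ up to a constant factor for $c_q = 1/(5 \log q)$.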
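
For small parameters the expectation in Lemma 26 can be computed exactly from the hypergeometric distribution of $|T \cap S|$. The following Python sketch is ours (the function name and the parametrization by the common subset size $m = (1 - p)n$ are assumptions made for illustration); it can be used to compare the exact value against the bound numerically.

```python
from fractions import Fraction
from math import comb


def expected_q_power(n: int, m: int, q: int) -> Fraction:
    """Exact value of E[q^{-|T minus S|}] for independent uniform size-m
    subsets S, T of [n], using that |T ∩ S| is hypergeometric."""
    total = Fraction(0)
    for j in range(max(0, 2 * m - n), m + 1):  # j = |T ∩ S|
        prob = Fraction(comb(m, j) * comb(n - m, m - j), comb(n, m))
        total += prob * Fraction(1, q ** (m - j))  # |T minus S| = m - j
    return total
```

By Jensen's inequality the exact value is never below $q^{-\mathbb{E}|T \setminus S|} = q^{-p(1-p)n}$, so the exponent in the lemma cannot be improved beyond a constant factor.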